Part I: Introduction

I.1 The HIBOL Language: A Brief Introduction

The notion of the data driven loop arises in connection with our work in the Very High Level Language HIBOL and the automatic programming system ProtoSystem I that supports it. Although the concept is of general interest outside of VHLLs and automatic programming, we find it profitable to use HIBOL as a vehicle for our discussion and a means of narrowing the scope of our discussion. Therefore we first present a brief description of the domain which HIBOL treats.


I.1.1 Flows
The HIBOL language concerns a restricted but significant subset of all data processing applications: batch oriented systems involving the repetitive processing of indexed records from data files. It provides a concise and powerful way of dealing with data aggregates: HIBOL has a single data type, the flow. This construct is a (possibly undefined) data aggregate and represents a collection of uniform records that are individually and uniquely indexed by a multi-component index. The components of a flow's index are called keys and the set of an index's keys is called its key-tuple.¹ Each record has a single data field (datum) in addition to the index information. (Real-world data aggregates, such as files, with more than one datum per logical record are abstracted in HIBOL as separate flows, one for each datum.)

1 This term is historical. A more expressive term would be "key set", but that has historically been used to indicate the universe from which a key may take its values.
I.1.2 Flow Expressions

Flow expressions can be formed through the application of arithmetic operators such as "+" or "×" to flows. The meaning of such an application to two flows is that the operation is applied to the data of corresponding records (those with identical indices) of the argument flows. The result is a new flow, having a record for each matched pair for which the operation was performed: the index value of such a record is identical to that of the matched pair, and the datum value is the result of the operation performed on the data of the pair. This generalizes to an arbitrary number of flow arguments.

Flow expressions can also be constructed using a conditional operator (similar to a "CASE" statement) which evaluates logical expressions in terms of corresponding flow records in order to select and then compute an expression as the individual records of the flows are processed. The logical expressions are constructed using the arithmetic comparison operators ">", "=", and "<". In addition the PRESENT operator may be used to test the presence of a record in a flow for a given value of the index of that flow. These may be composed using the logical connectives "AND", "OR" and "NOT".

Finally, there is a class of reduction operators that can be applied to flow expressions. The function of such an operator is to reduce a flow with an n-key index to one with an m-key index, where m < n, and the key-tuple of the m-key index is a subset of the key-tuple of the n-key index. All records of the argument flow that correspond to a single record of the result form a set to which a reduction operator (e.g. "maximum", "sum") can be applied to obtain a single value.
I.1.3 Flow Equations
Relationships between flows are expressed by flow equations of the form:
    <flow-name> IS <flow-expression>
where <flow-name> is a named flow and <flow-expression> is a flow expression in terms of named flows. The right- and left-hand sides must have identical indices.


I.1.4 Example
Consider a chain of stores whose items are supplied from a central warehouse. The collection of store orders for item restocking on a given day can be described as a flow called, say, CURRENTORDER. A record of that flow contains the quantity ordered by a particular store of a particular item. Each record has as its datum the quantity ordered and a 2-component index identifying the store making the order and the item ordered (the keys of the index are a store-id and an item-id). Let BACKORDER be the name of a flow (of similar structure) representing the collection of (quantities of) previous orders that could either not be filled or filled only partially.
The HIBOL statement
    DEMAND IS CURRENTORDER + BACKORDER
describes a new flow DEMAND representing the total demand of each item by each store. That is, each record in DEMAND contains a 2-component (item-id, store-id) index identifying its datum, which is the sum of the data for the same item and store in the CURRENTORDER and BACKORDER flows.
The HIBOL statement
    ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID
illustrates the use of the reduction operator SUM. It describes a new flow ITEMDEMAND representing the total demand of each item from all stores. That is, each of its records has a single-component index (item-id) identifying a particular item, and its datum is the total quantity in demand summed across all stores in the chain.
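The reduction in this example can be sketched outside HIBOL. The following Python fragment is only an illustration, not part of HIBOL or ProtoSystem I: it assumes a flow can be modelled as a dictionary from index tuples to data, and the helper name `sum_for_each` and all sample values are hypothetical.

```python
from collections import defaultdict

# Assumption: a flow is modelled as a dict mapping an index tuple to a datum.
# The index order here is (item_id, store_id); the sample data are made up.
DEMAND = {
    ("bolt", "store-1"): 10,
    ("bolt", "store-2"): 5,
    ("nut", "store-1"): 7,
}

def sum_for_each(flow, key_position):
    """Reduce a flow by summing the data of all records sharing one key value."""
    result = defaultdict(int)
    for index, datum in flow.items():
        # The output index keeps only the key named in the FOR EACH clause.
        result[(index[key_position],)] += datum
    return dict(result)

# Analogue of: ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID
ITEMDEMAND = sum_for_each(DEMAND, key_position=0)
```

Each record of `ITEMDEMAND` has a one-component index, matching the description of the reduced flow above.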


I.1.5 Additional Information
The computational part of a data processing system can be described by giving a full set of flow equations of the type shown above. To complete the system's description additional data and timing information must be given:

- for each flow, the components of its index; the type of its data [...]

- for each key, its type [...]

- for each [...]


I.2 Iteration Sets and Fully Explicit HIBOL

A flow expression, as explained above, represents a set of records obtained by the record-by-record application of a formula to the records of the flows that appear as terms in the expression. In this paper we shall be interested in exactly for which index values (and thus records) the indicated formula is applied. The set of these index values is termed the iteration set.²

The HIBOL language is rather informal about specifying iteration sets. It contains abundant provisions (through the use of defaults) for implicit semantics based on the presence or absence of records in the flows appearing in flow expressions. For example, the HIBOL flow expression
    CURRENTORDER + BACKORDER


2 After Baron [1] 




describes a flow that has a record for each index value for which either CURRENTORDER or BACKORDER (or both) has a record:
- if both flows have a record for a given index value, the resultant flow has a record with the same index value, whose datum is the sum of those of the corresponding records in the two flows;
- if only one flow has a record for a given index value, the resultant flow has a record with the same index value and the same datum value;
- otherwise there is no record in the resultant flow.
One may think of the semantics of addition in HIBOL, then, as following the convention that the operation + is performed if and only if at least one of its operands is present and that each missing operand is treated as if it were the additive identity (0).
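The default convention for addition can be made concrete with a small Python sketch. It assumes, purely for illustration, that a flow is a dictionary from index values to data; `flow_add` is a hypothetical helper, not a HIBOL primitive.

```python
def flow_add(a, b):
    """Add two flows under the default convention for '+'.

    The iteration set is the union of the two index sets, and a missing
    operand is treated as the additive identity 0.
    """
    return {idx: a.get(idx, 0) + b.get(idx, 0) for idx in a.keys() | b.keys()}

# Made-up sample flows indexed by (item_id, store_id).
CURRENTORDER = {("bolt", "store-1"): 10, ("nut", "store-1"): 7}
BACKORDER = {("bolt", "store-1"): 2, ("bolt", "store-2"): 5}

DEMAND = flow_add(CURRENTORDER, BACKORDER)
```

An index value absent from both inputs never enters the union, so no output record is produced for it, which corresponds to the third case listed above.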

Although such conventions are convenient in writing HIBOL, for the sake of clarity and rigor, we require fully explicit iteration set specification. This can be obtained through the thorough use of the HIBOL primitives IF and PRESENT. A fully explicit form of the above HIBOL flow expression would be

    CURRENTORDER + BACKORDER IF CURRENTORDER PRESENT AND BACKORDER PRESENT
    ELSE CURRENTORDER        IF CURRENTORDER PRESENT
    ELSE BACKORDER           IF BACKORDER PRESENT

Here the index values for which the flow expression formula is to be applied have been made explicit by rewriting it as a three-clause conditional expression in terms of three sub-expressions, each of whose iteration sets is specified by an explicit condition on the presence of records in the flows involved. This is a legal HIBOL flow expression, although in view of the existing conventions it is overspecified (redundant). For our purposes we will distinguish [...] this means that a flow expression in FE-HIBOL [...]
[...] description is declarative in nature: it describes the relationships among the flows. An implemented data processing system is procedural in nature: it must describe in detail how the flows are computed. The flow equations must be reinterpreted as basic computation steps (with an output flow and one or more flows as inputs) and constraints on the order in which these computations can be performed (the computation producing a flow must be performed before any computations using that flow) must be made explicit.
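The ordering constraint just described (a flow must be produced before it is used) amounts to a topological ordering of the flow equations. The sketch below is an illustration using Python's standard `graphlib` module; it is not a description of how ProtoSystem I derives the order, and the dictionary encoding of the equations is an assumption.

```python
from graphlib import TopologicalSorter

# Assumption: each output flow maps to the set of flows its equation reads.
equations = {
    "DEMAND": {"CURRENTORDER", "BACKORDER"},
    "ITEMDEMAND": {"DEMAND"},
}

def computation_order(equations):
    """Return the output flows in an order where producers precede consumers."""
    order = TopologicalSorter(equations).static_order()
    # Flows that are never computed (pure inputs) are not computation steps.
    return [flow for flow in order if flow in equations]

steps = computation_order(equations)
```

A cyclic set of equations would make `static_order` raise `graphlib.CycleError`, which is one way such a description could be rejected.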

Design:*

The implementation will make use of files of data to be processed by job steps which will in turn create other files. Each file will contain the information represented by one or more flows; each job step will perform the processing to satisfy one or more flow equations. The design of each file (information contained, organization, storage device, record sort order) and of each job step (equations implemented, loop structure, accessing methods used) should be made in such a way as to minimize some overall cost measure (e.g. time used, number of secondary storage I/O events) for the execution of the data processing system. Typically this requires a behavioral analysis of tentative design configurations.

Code Generation:

The system's design must be coded in a supported high-level language so that it can be executed.


I.4 Data Driven Loops

Each flow equation represents a computation whose implementation is essentially iterative in [...]

* In ProtoSystem I the design process is performed by the Optimizing Designer module.
[...] we call any set of values for a particular index an index set, and we distinguish two special kinds of index sets:⁶
The set of index values for which a flow F contains a record is called the index set of F (denoted IS(F)).
The set of index values for which an input flow F_i contains a record that will be used in generating a record of the output flow F is called the critical index set of F_i with respect to F (denoted CIS_F(F_i)).
These two should not be confused: CIS_F(F_i) for some flow F_i will often be a proper subset of IS(F_i).⁷
The problem we face is that of finding some way of enumerating the critical index sets of each input so that the loop can be properly driven.⁸ It is generally impractical to use the set of all possible (legal) index values for which an input might have a record. For one thing this set may be unbounded. Even if it is finite and enumerable, it will often be much larger than the critical index set and thus grossly inefficient. In the DEMAND flow equation example given above, for instance, the critical index set of the input flow CURRENTORDER is likely to be orders of magnitude smaller than its maximum possible size (the case where every store has orders for every item).
A much more efficient way of enumerating a set of index values that is assured to cover the critical index sets of the inputs is to use the union of the index sets of the input flows. This will work because a record of the output can be produced only if there is some input flow in which that [...]


6 Unfortunately, this terminology is at variance with that used by Baron in his thesis [1]. Baron uses the term "critical index set" to mean what we call the "index set".

7 On no account, of course, can it be other than a proper or improper subset of IS(F_i).

8 This statement is somewhat oversimplified, but it will suffice for now. A fully precise statement of the problem is given by the Fundamental Driving Constraint in Part IV.
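The idea of driving a loop from the union of the inputs' index sets can be illustrated in a few lines of Python. This is a sketch under the assumption that a flow is a dictionary keyed by index value; `index_set` and `driving_set` are hypothetical names.

```python
def index_set(flow):
    """IS(F): the set of index values for which flow F contains a record."""
    return set(flow)

def driving_set(*inputs):
    """Union of the inputs' index sets.

    Every critical index set is a subset of the corresponding index set,
    so this union is guaranteed to cover all of them.
    """
    return set().union(*(index_set(f) for f in inputs))

# Made-up sample flows indexed by (item_id, store_id).
CURRENTORDER = {("bolt", "store-1"): 10}
BACKORDER = {("bolt", "store-1"): 2, ("nut", "store-2"): 4}

drive = driving_set(CURRENTORDER, BACKORDER)
```

With two items and two stores there are four legal index values, but only two are ever enumerated here, which is the efficiency argument made above in miniature.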
Part II: Structure of Data Driven Loops
Before a general treatment of data driven loops can be developed it is necessary to examine the structures of the loops encountered in the HIBOL system. We begin by presenting a taxonomy of computation types and their corresponding loop implementations.


II.1 Loop Terminology
Before discussing loop structures it is useful to establish some terminology. By the term loop we mean a control construct which somehow enumerates a set of values for a loop-index and which performs a fixed sequence of statements (its body), once for each value of the loop-index. A loop may contain one or more loops within its body. The inner loops are said to be nested within the outer (enclosing) loop and the structure as a whole is called a nested loop structure. Each enclosure defines a different level of the nested loop structure. The degenerate case of a nested loop structure, where there is no loop in the body of the outer loop, is called a single-level loop, since there is only one loop level.
A totally nested loop is a nested loop structure whose component loops are totally ordered under enclosure (i.e. for any two loops L1 and L2 either L1 is inside L2 or L2 is inside L1).


II.2 Kinds of Computations and Their Loops
Each run (computation, job step, program) in the implementation produced for a HIBOL description of a data processing system is essentially a loop that iterates over the records of its input files to generate records of its output file(s). The structure of this loop depends on the nature of the computation being performed. We will begin with computations that directly implement single HIBOL flow equations of various types. Then we will consider computations that implement more than one flow equation (aggregated computations) simultaneously.


II.2.1 Simple Computations

A simple computation [...] a single flow. [...] computation is described by the HIBOL flow equation

    PAY IS HOURS * 3.00 [...]

[...] loop having a single [...] is read and used to provide both a datum and an index value [...]

In the SEAL language [...] this loop looks like:


    for each (employee-id) from HOURS
        get HOURS(employee-id)
        PAY(employee-id) =
            if defined[HOURS(employee-id)]
               and not (HOURS(employee-id) > 40)
              then HOURS(employee-id) * 3.0
            else if defined[HOURS(employee-id)]
              then 120.0 + (HOURS(employee-id) - 40) * 4.5
            else undefined
        if defined[PAY(employee-id)]
          then write PAY(employee-id)
    end
The for-end construct represents the basic iteration over values of the index employee-id. It specifies that the values for the index are obtained from the HOURS flow. For each index value, the corresponding record of HOURS is read, the corresponding record of PAY is generated, and if the generation was successful that record is written out. Notice that the PAY calculation is a direct translation from the HIBOL flow equation.


For reasons of exposition the loop implementation presented here is of the most general form. An actual implementation would incorporate various efficiency-enhancing improvements.⁹ Nevertheless, we shall continue to use such forms to show explicitly where I/O and [...] occur conceptually.


9 For instance, since the for has to read the next record of the driver to get the current index value, the get could be omitted. Furthermore, the defined tests in the PAY calculation could be omitted since they are testing the presence of a record which must be present. Finally, in this computation, the check before output could also be omitted.
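For readers more comfortable with a conventional language, the single-level loop for the PAY computation can be sketched in Python. This is an illustrative analogue of the general (unoptimized) form, not generated code; it assumes a flow is a dictionary keyed by employee id, and the constants are taken from the SEAL listing as read here.

```python
def compute_pay(hours):
    """Single-level loop driven by HOURS: one iteration per input record."""
    pay = {}
    for employee_id in sorted(hours):          # for each (employee-id) from HOURS
        h = hours.get(employee_id)             # get HOURS(employee-id)
        if h is not None and not h > 40:
            p = h * 3.0                        # straight time
        elif h is not None:
            p = 120.0 + (h - 40) * 4.5         # 40 hours plus overtime
        else:
            p = None                           # "undefined"
        if p is not None:                      # write only if a datum exists
            pay[employee_id] = p
    return pay

PAY = compute_pay({"e1": 38, "e2": 44})
```

As the footnote observes, the presence tests on the driver are redundant here; they are kept only to mirror the general form.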


II.2.2 Matching Computations

A matching computation computes a non-reduction flow expression involving two or more flows. Thus it is similar to a simple computation, but instead of operating on a single record of a single input flow to produce an output record, it operates on a set of corresponding records, one from each input flow. Correspondence is established by common index values. The name "matching computations" derives from the necessity of matching up the records of the inputs by index values before they can be operated on.

Two sub-classes of matching computations can be distinguished depending on whether all of the inputs have indices with identical key-tuples or not.


II.2.2.1 Expressions Involving Flows with a Uniform Index
Consider a pay calculation similar to that given above, but where employees are paid various hourly rates. Let RATE be a flow, indexed by (employee-id), each of whose records has as its datum the hourly pay rate for the employee indicated by its index value. The flow equation becomes

    PAY IS HOURS * RATE                 IF HOURS PRESENT
                                           AND RATE PRESENT
                                           AND NOT HOURS > 40
           ELSE
           RATE * 40 +
           (HOURS - 40) * 1.5 * RATE    IF HOURS PRESENT
                                           AND RATE PRESENT

HOURS and RATE have identical indices, each consisting of the single key "employee-id". The loop that implements such a computation has a [...] a record in the HOURS [...] that file alone is sufficient to drive the loop. (Alternatively, by similar reasoning, the RATE file could be used to drive the loop.) This is the simplest case of a matching computation because only one input is needed to drive the loop. (The computation of the flow [...] above is also of this type.) On each iteration the next record of the HOURS file is read, the corresponding RATE record is fetched, and the computation of gross pay performed.
This loop is represented in the SEAL language thus:
    for each (employee-id) from HOURS
        get HOURS(employee-id)
        get RATE(employee-id)
        PAY(employee-id) =
            if defined[HOURS(employee-id)]
               and defined[RATE(employee-id)]
               and not (HOURS(employee-id) > 40)
              then HOURS(employee-id) * RATE(employee-id)
            else if defined[HOURS(employee-id)]
                    and defined[RATE(employee-id)]
              then RATE(employee-id) * 40 +
                   (HOURS(employee-id) - 40) * RATE(employee-id) * 1.5
            else undefined
        if defined[PAY(employee-id)]
          then write PAY(employee-id)
    end
Again the defined checks on the driver, HOURS, are superfluous. But those on RATE are necessary (to determine whether the corresponding get was successful) and the defined check on PAY is necessary (so that a record is written if and only if a datum was generated).
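The single-driver matching loop can be sketched in Python as follows. The sketch assumes flows are dictionaries keyed by employee id and is meant only to show which presence tests remain necessary; the function name and sample data are hypothetical.

```python
def compute_pay(hours, rate):
    """Matching computation driven by HOURS alone.

    RATE is fetched by index on each iteration and may be absent, in which
    case PAY is undefined for that index and no record is written.
    """
    pay = {}
    for employee_id, h in sorted(hours.items()):   # the driver supplies index and datum
        r = rate.get(employee_id)                  # get RATE(employee-id)
        if r is None:
            continue                               # the get failed: PAY undefined
        if not h > 40:
            pay[employee_id] = h * r
        else:
            pay[employee_id] = r * 40 + (h - 40) * r * 1.5
    return pay

PAY = compute_pay({"e1": 38, "e2": 44, "e3": 10}, {"e1": 2.0, "e2": 4.0})
```

Employee `e3` has hours but no rate, so no PAY record is produced for it. An employee with a rate but no hours is never visited at all, because HOURS drives the loop.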


Now consider the HIBOL flow equation for the DEMAND flow given above:

    DEMAND IS CURRENTORDER + BACKORDER IF CURRENTORDER PRESENT AND BACKORDER PRESENT
           ELSE CURRENTORDER           IF CURRENTORDER PRESENT
           ELSE BACKORDER              IF BACKORDER PRESENT

[...] inputs are necessary to drive the loop [...] for a given index while the others do not.¹⁰ If the drivers have their records sorted in the same order (say, alphabetically) by index value, [...]

0. Read the first record of each input [...]

1. Use the smallest index value [...]

2. Discard all driver records [...] every driver whose record was discarded.

3. Repeat [...]

10 [...] the only sorting constraint on inputs is that [...] be sorted in the same order [...] no constraints on the order of the records in the other inputs [...]

[...] be fetched by sequential reading, which is generally more efficient.
These details are implicit in the SEAL representation of the loop, which is simply:
    for each (item-id, store-id) from CURRENTORDER, BACKORDER
        get CURRENTORDER(item-id, store-id)
        get BACKORDER(item-id, store-id)
        DEMAND(item-id, store-id) = ...
        if defined[DEMAND(item-id, store-id)]
          then write DEMAND(item-id, store-id)
    end
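A loop with several sorted drivers is, in effect, a multi-way merge. The Python sketch below shows one conventional way to write such a merge; it is an assumption-laden illustration (drivers are lists of `(index, datum)` pairs sorted by index, and `merge_drive` is a hypothetical name), not a transcription of the algorithm in the text.

```python
def merge_drive(*drivers):
    """Enumerate, in sorted order, every index held by at least one driver.

    Each driver is a list of (index, datum) pairs sorted by index. For each
    index the generator yields the data of all drivers, None where absent.
    """
    iters = [iter(d) for d in drivers]
    current = [next(it, None) for it in iters]        # read the first records
    while any(rec is not None for rec in current):
        smallest = min(rec[0] for rec in current if rec is not None)
        data = []
        for i, rec in enumerate(current):
            if rec is not None and rec[0] == smallest:
                data.append(rec[1])
                current[i] = next(iters[i], None)     # replace the consumed record
            else:
                data.append(None)                     # no record at this index
        yield smallest, data

# Made-up sample drivers indexed by (item_id, store_id).
CURRENTORDER = [(("bolt", "s1"), 10), (("nut", "s1"), 7)]
BACKORDER = [(("bolt", "s1"), 2), (("bolt", "s2"), 5)]

DEMAND = {idx: sum(d for d in data if d is not None)
          for idx, data in merge_drive(CURRENTORDER, BACKORDER)}
```

Each driver is read strictly sequentially and exactly once, which is why a common sort order among the drivers matters.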


II.2.2.2 General Discussion of Mixed-Index Flow Expressions

The treatment of mixed-index flow expressions in this paper will be restricted to those that are legal in HIBOL. The restrictions that HIBOL imposes are made for good reasons. A brief discussion of the various conceivable types of mixed-index flow expressions is presented here in order to show the motivation behind these restrictions.

The various cases where the flows in a flow expression have mixed indices (i.e. their indices have different key-tuples) can be distinguished by the set relationships among the key-tuples.

Consider the case where flows have disjoint key-tuples (e.g. (w, x) and (y, z)). Correspondence among records of such flows is meaningless, so we do not allow them to appear in the same flow expression.

Now consider the more general case where there is intersection among index key-tuples, but the union of their pair-wise intersections is not identical to their (simple) union. In this case correspondence is always ambiguous. For example, consider the two flows: A with index (x, y) and B with index (y, z). Suppose that there are records in A for the particular index values (x1, y1) and (x2, y1) and that there are records in B for index values (y1, z1), (y1, z2) and (y1, z3). Which of A's records correspond to which of B's records?¹²

For correspondence to be meaningful and unambiguous it must be the case that the union of the pair-wise intersections of the key-tuples of the indices involved is identical to their union. This is always true when there exists an index among the flows involved whose key-tuple is a superset of all the key-tuples of the other flows.

To be sure, there are other ways of satisfying the condition of the preceding paragraph. These involve conjunctions of three or more indices. Consider, for instance, the three flows: A with index (x, y), B with index (y, z), and C with index (x, z). Corresponding triplets are all unique and unambiguous, of the form (x_i, y_j), (y_j, z_k), (x_i, z_k). For the sake of simplicity, however, this case is prohibited in HIBOL.


II.2.2.3 Mixed-Index Flow Expressions Allowed in HIBOL

It is possible in HIBOL to apply operators to two or more flows having different indices as long as each index is a sub-index of the index of some unique flow involved (i.e. as long as the key-tuple of each index is a subset of the key-tuple of the index of that unique flow). Clearly, the index of this unique flow is identical to the index of the flow expression as a whole. HIBOL allows a mixed-index flow expression only if its computation can be driven by the set of those flows involved having indices identical to that of the flow expression.
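The sub-index restriction can be phrased as a simple check on key-tuples. The following Python sketch is one possible reading of that rule, offered as an illustration; `driver_index` is a hypothetical helper and key-tuples are assumed to be tuples of key names.

```python
def driver_index(key_tuples):
    """Return the key-tuple that contains every other one, or None if absent.

    Under the rule described above, a mixed-index flow expression is
    acceptable only when such an enclosing key-tuple exists.
    """
    sets = [frozenset(kt) for kt in key_tuples]
    for candidate in sets:
        if all(other <= candidate for other in sets):
            return candidate
    return None

ok = driver_index([("item-id", "store-id"), ("item-id",)])
ambiguous = driver_index([("x", "y"), ("y", "z")])
three_way = driver_index([("x", "y"), ("y", "z"), ("x", "z")])
```

The two-flow case with indices (x, y) and (y, z) and the three-flow case discussed in the preceding section both yield `None`, in line with their being excluded from HIBOL.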


12 Of course, we could allow all pairs to match (in Cartesian product fashion) so that the expression A + B would represent the six possible combinations of additions for these 5 index values; but this would change (extend) the semantics of HIBOL.
For example, suppose we want to calculate the extended prices¹³ of the current store orders (the flow CURRENTORDER) in our store chain example. Let PRICE be a flow indexed by (item-id), each of whose records has as its datum the per-item price associated with the item identified by its index. The flow equation for EXTENDEDPRICE, indexed by (item-id, store-id), would be expressed in HIBOL thus:

    EXTENDEDPRICE IS CURRENTORDER * PRICE IF CURRENTORDER PRESENT
                                             AND PRICE PRESENT


The intent here is: for every record in CURRENTORDER, find the corresponding record in PRICE and, if the latter is present, multiply their respective data to calculate the datum of a corresponding record in EXTENDEDPRICE. Notice that because PRICE and CURRENTORDER have different indices ((item-id) and (item-id, store-id), respectively) the notion of correspondence must be extended in a natural way from pure identity of index values. We adopt the convention that for a particular value of item-id the index (item-id) matches any index (item-id, store-id) with the same value of item-id, regardless of the value of store-id. This augmented definition of correspondence is extended to the general case where the key-tuple of one index is a subset of the key-tuple of another. That is, for given values of k_1, ..., k_m the index (k_1, ..., k_m) is said to match any instance of an index (k_1, ..., k_m, k_m+1, ..., k_n) with the same values of k_1, ..., k_m, regardless of the values of k_m+1, ..., k_n.
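The extended notion of correspondence reduces to comparing only the keys that the sub-index actually has. A minimal Python sketch follows, assuming (for illustration only) that an index value is encoded as a dictionary from key name to key value; `matches` is a hypothetical helper.

```python
def matches(sub_index, full_index):
    """True if every key of sub_index has the same value in full_index.

    Keys of full_index that are absent from sub_index are ignored, which is
    the 'regardless of the values of the remaining keys' part of the rule.
    """
    return all(k in full_index and full_index[k] == v
               for k, v in sub_index.items())

price_index = {"item-id": "bolt"}
same_item = matches(price_index, {"item-id": "bolt", "store-id": "s1"})
other_item = matches(price_index, {"item-id": "nut", "store-id": "s1"})
```

A PRICE record for `bolt` therefore corresponds to every CURRENTORDER record for `bolt`, whatever the store.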

Since a set of input flows, each with index identical to the flow expression's, can be used to drive a mixed-index matching computation, its implementation is similar to that for a uniform index matching computation: the sorted drivers are read in such a way as to enumerate the critical index sets of all of the input flows; the resulting index values are used to fetch records from the rest of the inputs (including all those whose indices are sub-indices of the flow expression's index).


13 The extended price of a quantity ordered is the product of the quantity and the per-item price.


[...] a given record of PRICE is liable to be read more than once. [...]

In the case where the records of CURRENTORDER are sorted by item-id, the EXTENDEDPRICE [...]


Basically, the outer loop chooses a value of the sub-index (item-id) and fetches the corresponding PRICE record. Then it performs the inner loop. Within the inner loop the value of the item-id key is held constant. All corresponding records of CURRENTORDER are read and the computation described in the flow equation is performed using the data of these records together with the datum of the PRICE record fetched in the outer loop. The results are used to build and output the corresponding records of EXTENDEDPRICE. This process is repeated until all flows are exhausted.

In detail the implementation is as follows. Before either loop is entered a record of CURRENTORDER is read. The outer loop uses this record to obtain the first value of the sub-index (item-id) and fetches the corresponding record from PRICE. Then it performs the inner loop. The inner loop uses the current record of CURRENTORDER and continues to read records sequentially from CURRENTORDER until the sub-index is observed to change or an end-of-file condition occurs. When either of these conditions occurs, it exits to the outer loop. If an eof has occurred, the outer loop exits. Otherwise it iterates, using the sub-index value of the current CURRENTORDER record as the new value to be held constant in the inner loop, fetching the corresponding PRICE record and performing the inner loop again.


The corresponding SEAL code is: 


    for each (item-id) from CURRENTORDER
        get PRICE(item-id)
        for each (store-id) from CURRENTORDER(item-id)
            get CURRENTORDER(item-id, store-id)
            EXTENDEDPRICE(item-id, store-id) =
                if defined[CURRENTORDER(item-id, store-id)]
                   and defined[PRICE(item-id)]
                  then CURRENTORDER(item-id, store-id) * PRICE(item-id)
                else undefined
            if defined[EXTENDEDPRICE(item-id, store-id)]
              then write EXTENDEDPRICE(item-id, store-id)
        end
    end
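The same two-level structure can be sketched in Python, where `itertools.groupby` plays the role of the inner loop's sub-flow. This is an illustrative analogue only: it assumes CURRENTORDER is a list of `((item_id, store_id), quantity)` records already sorted by item id and PRICE is a dictionary, and the function name is hypothetical.

```python
from itertools import groupby

def extended_price(currentorder, price):
    """Nested loop: the outer loop fixes item_id, the inner loop reads the sub-flow.

    currentorder must be sorted by item_id so that each sub-flow is contiguous.
    """
    out = []
    # Outer loop: one iteration per distinct value of the sub-index (item_id).
    for item_id, sub_flow in groupby(currentorder, key=lambda rec: rec[0][0]):
        p = price.get(item_id)                     # get PRICE(item-id), once per item
        # Inner loop: driven by the sub-flow, with item_id held constant.
        for (_, store_id), quantity in sub_flow:
            if p is not None:                      # write only if PRICE is present
                out.append(((item_id, store_id), quantity * p))
    return out

orders = [(("bolt", "s1"), 10), (("bolt", "s2"), 5), (("nut", "s1"), 7)]
EXTENDEDPRICE = extended_price(orders, {"bolt": 2, "nut": 3})
```

Because the PRICE lookup sits in the outer loop, each PRICE record is fetched once per item rather than once per order record.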


Notice that the outer loop is driven by CURRENTORDER (the whole flow), but that the inner loop is driven by CURRENTORDER(item-id) (the sub-flow of CURRENTORDER consisting of just those records whose indices correspond to the value of the sub-index (item-id) fixed by the outer loop). What this means is that for the outer loop the next value of the sub-index (item-id) will be taken from the next record of the CURRENTORDER flow. But for the inner loop the next value for the sub-index (store-id) will be taken from the next record of the sub-flow of CURRENTORDER corresponding to the current value of (item-id). If there are no further records in CURRENTORDER for this fixed value of (item-id) this will be treated just like an end-of-file condition and the iteration of the inner loop will terminate. Thus the inner loop is driven by a succession of sub-flows, one for each iteration of the outer loop.

The nested loop implementation scheme is readily extended to a greater number of loop levels, provided that appropriate sorting constraints hold among the flows involved. For example, suppose that there
are 3 flows involved: A with index (k_1, k_2, k_3); B with index (k_1, k_2) and C with index (k_1). And suppose further that B is sorted by k_1 and that A is sorted first by k_1 and, within segments corresponding to a fixed value of k_1, the records of A are further sorted by k_2. Then the flow equation can be implemented by a nested loop structure involving 3 loops (innermost loop, middle loop and outermost loop). The outermost loop chooses a value for the key k_1 to be held constant within the middle loop (and perforce in the innermost loop, which is contained in the middle loop). It also fetches the corresponding record of C for use within the contained loops. Then it executes the middle loop, which, in turn, chooses a value for the key k_2 to be held constant within the innermost loop. The middle loop also fetches the corresponding record of B for use within the innermost loop. Then it executes the innermost loop. In the innermost loop the values of the keys k_1 and k_2 are held constant. The innermost loop reads all corresponding records of A, using their data and those of the already read records to perform the calculation described in the flow equation and to build and write out the records of the output flow. When the innermost loop has read and processed all records of A corresponding to the fixed values of k_1 and k_2 it exits to the middle loop, which chooses a new value for k_2 and iterates. When the middle loop has exhausted all possibilities for the value of k_2 fixed in it, it returns to the outermost loop, which chooses a new value of k_1 and iterates. This loop structure expressed in the SEAL language looks like: [...]
treated as a single flow.
Conceptually, the argument flow is partitioned into subsets (sub-flows) by an equivalence
relation defined on the sub-index (a key or keys) indicated in the FOR EACH clause; then the
reduction operator is applied to the members of each subset to generate the value of the datum of
the output record corresponding to that subset. For instance, in the first example given above the
DEMAND flow is conceptually partitioned into record subsets by item-id. Thus, all records in DEMAND
whose index has the value item-id1 for the item-id key are in one subset, all records for
item-id = item-id2 are in another, and so forth (empty subsets are ignored). The datum for the record in
ITEMDEMAND with index = (item-id1) is calculated by summing all of the data in the records in the
subset corresponding to item-id = item-id1.

Conceptually, the implementing iteration for a simple reduction expression in a single flow
consists of two loops, one nested inside the other. The inner loop implements the application of the
indicated reduction operation to a subset of the input's records. Within this loop the value of the
sub-index defining the subset is held constant. Returning to the SUM OF DEMAND example, the
inner loop implements the summation of the data of the records in each subset of DEMAND. That is,
the inner loop is performed for each value of item-id for which there are records in DEMAND.
Within the inner loop the particular value of the key item-id is held constant, all records of DEMAND
corresponding to that key value are fetched and their data are summed.

The outer loop performs clerical work. It chooses a value for the subsetting sub-index (e.g. a
value of item-id), executes the inner loop (which fetches records of the input corresponding to the
chosen sub-index and, for instance, adds them to the accumulator), and when the inner loop is
finished, it uses the resulting value as the datum of the output record corresponding to the chosen
sub-index, and writes that record out.


[Listing: the two-level loop implementing ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID. The outer loop runs over each (item-id) from DEMAND and sets sum to undefined; the inner loop runs over each (store-id), gets DEMAND(item-id, store-id) and accumulates it into sum; after the inner loop the outer loop writes ITEMDEMAND(item-id) if sum is defined.]

Data Driven Loops - 27 


It may at first seem unnecessarily baroque to initialize the accumulator sum to "undefined" in the
outer loop, test it in the inner loop for definedness and then initialize it if undefined. In this
simple example we could just initialize it to 0 in the outer loop and not bother with the definedness
checks. We have chosen the former course for two reasons. First, we wish to make explicit the
conditions under which the sum (and thus a record of the output ITEMDEMAND) is defined for a
given value of the key item-id. Second, a little thought will show that for other reduction
operations (viz., MAX and MIN) initialization of the accumulator must (at least conceptually) be
postponed until the inner loop where the initializing value is obtained by the first get. Moreover,
in general, when computations are aggregated (see below) and more than one activity is performed
in the inner loop, it is then possible (if some driver besides DEMAND is used) that for some values of
item-id no sum is calculated in the inner loop and sum is still undefined on exit from that loop.
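The role of the "undefined" accumulator can be illustrated with a short Python sketch. It assumes the input flow is a dictionary keyed by (item-id, store-id) tuples and uses `None` for "undefined"; the function name `reduce_flow` and the sample data are hypothetical. Because the accumulator takes its first value from the first record fetched, the same loop serves SUM, MAX and MIN without a per-operator initial value.

```python
from itertools import groupby

def reduce_flow(flow, op):
    """flow: {(item_id, store_id): datum}; returns {(item_id,): reduced datum}."""
    out = {}
    # Outer loop: choose a value of the sub-index (item_id).
    for item_id, subset in groupby(sorted(flow), key=lambda k: k[0]):
        acc = None                      # accumulator starts "undefined"
        # Inner loop: item_id is held constant.
        for key in subset:
            datum = flow[key]           # get
            # The first get initializes the accumulator; later ones apply op.
            acc = datum if acc is None else op(acc, datum)
        if acc is not None:             # write only if the result is defined
            out[(item_id,)] = acc
    return out

# Hypothetical sample data.
DEMAND = {("bolt", "s1"): 5, ("bolt", "s2"): 7, ("nut", "s1"): 2}
ITEMDEMAND = reduce_flow(DEMAND, lambda a, b: a + b)
MAXDEMAND = reduce_flow(DEMAND, max)
```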

If the input flow is not sorted as above, the computation for a reduction operation becomes
somewhat more complex. One possibility is to create and maintain separate accumulators for each
value of the sub-index occurring in the input flow. Since the number of accumulators cannot
be known a priori (i.e. at compile time), storage for them must be allocated on the fly (during
execution of the program). In PL/I, for example, the following scheme might
be used:

Declare an accumulator array to have CONTROLLED storage.
Make a pre-pass over the input flow to count the number of different sub-index
values occurring.
Execute an ALLOCATE statement to define the size of the array.
Make a second pass over the input flow to perform the accumulation.
Write all accumulated values out to the output flow.

In this scheme there are two separate loops instead of a totally nested loop structure.
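The steps above can be mimicked in Python as a rough sketch. It is not PL/I: a dictionary assigns each distinct sub-index value a slot during the pre-pass, and a list sized after that pass plays the part of the array allocated on the fly. The function name `reduce_unsorted` and the record layout are assumptions made for illustration.

```python
def reduce_unsorted(records, op):
    """records: iterable of ((item_id, store_id), datum) pairs in any order."""
    records = list(records)

    # Pre-pass: count the distinct sub-index values, giving each one a slot.
    slot = {}
    for (item_id, _store_id), _datum in records:
        slot.setdefault(item_id, len(slot))

    # Allocation: the accumulator array is sized only now that the count is known.
    acc = [None] * len(slot)

    # Second pass: perform the accumulation.
    for (item_id, _store_id), datum in records:
        i = slot[item_id]
        acc[i] = datum if acc[i] is None else op(acc[i], datum)

    # Write all accumulated values out to the output flow.
    return {(item_id,): acc[i] for item_id, i in slot.items()}

# Hypothetical unsorted input.
unsorted_demand = [(("nut", "s1"), 2), (("bolt", "s2"), 7), (("bolt", "s1"), 5)]
totals = reduce_unsorted(unsorted_demand, lambda a, b: a + b)
```

The two passes are separate loops over the whole input, in contrast with the nested structure used for sorted input.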


Alternatively, a nested loop, multi-pass scheme could be implemented: The outer loop would
II.3.1 Formal Representation of Nested Loop Structures

We have seen that the basic control structure used in implementing a computation is the
totally nested loop. Associated with each loop in the nesting is a set of keys that it will fix and
which will remain constant in the loops it contains. It is easy to see that this condition means that
the set of keys fixed within any loop is necessarily a (proper) superset of the set of keys fixed within
any of its enclosing loops. Thus, the set of keys fixed within a loop is sufficient to determine its
level in the nesting.

Now notice that the body of every loop (except the innermost one) contains exactly one
top-level loop; thus, the body is naturally divided into three parts:

the prolog--those actions performed before the enclosed loop

the enclosed loop

the epilog--those actions performed after the enclosed loop.

Conceptually, then, a totally nested loop can be represented as a list of loop descriptions, one
for each of the component loops. Each such description would consist of a level identifier
(indicating at which level of nesting it occurs) and the prolog and the epilog. However, during the
design phase, while implementations are being developed and in particular when computation
aggregations are being considered, it is useful to distinguish 3 types of actions within the body of
a loop:
Prolog--those actions that must be performed before the enclosed loop
Epilog--those actions that must be performed after the enclosed loop
General--those actions that could end up in either the prolog or the epilog
It is also useful to separate I/O actions from the other actions. Thus, we represent each loop
in the nesting as a structure of the following form:¹⁵

¹⁵ This representation, and the theory of computation aggregation associated with it, are due
largely to the work of R. C. Fleischer [2], who improved on the earlier work of R. V. Baron.
(Level,
 (Inputs_P, Prolog, Outputs_P)
 (Inputs_G, General, Outputs_G)
 (Inputs_E, Epilog, Outputs_E))
where
Level indicates the depth of the loop in the nesting
Inputs_P are the files (necessarily) read in the Prolog section.
Inputs_G are the files (necessarily) read in the General section.
Inputs_E are the files (necessarily) read in the Epilog section.
Outputs_P are the outputs generated in the Prolog section (possibly used in the enclosed loop
or in the Epilog section)
Outputs_G are the outputs generated in the General section.
Outputs_E are the outputs generated in the Epilog section.
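One possible encoding of this structure, given here only as a sketch, uses Python dataclasses. The class names `Section` and `LoopLevel`, the field names, and the helper `nesting_order` are hypothetical; the sample values follow the ITEMDEMAND reduction described in this report, whose outer loop initializes the sum in its Prolog and writes ITEMDEMAND in its Epilog while the inner loop reads DEMAND in its General section.

```python
from dataclasses import dataclass, field

@dataclass
class Section:
    inputs: set = field(default_factory=set)     # flows read in this section
    actions: list = field(default_factory=list)  # non-I/O actions
    outputs: set = field(default_factory=set)    # outputs generated here

@dataclass
class LoopLevel:
    level: tuple                                 # keys fixed within this loop
    prolog: Section = field(default_factory=Section)
    general: Section = field(default_factory=Section)
    epilog: Section = field(default_factory=Section)

def nesting_order(loops):
    """Order loops from outermost to innermost.

    The set of keys fixed in a loop determines its depth, so sorting by the
    number of keys is enough for a totally nested loop.
    """
    return sorted(loops, key=lambda loop: len(loop.level))

inner = LoopLevel(
    level=("item-id", "store-id"),
    general=Section(inputs={"DEMAND"}, actions=["calculate sum"]),
)
outer = LoopLevel(
    level=("item-id",),
    prolog=Section(actions=["initialize sum"]),
    epilog=Section(outputs={"ITEMDEMAND"}),
)
itemdemand_loops = nesting_order([inner, outer])
```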


II.3.2 Computation Implementation
The implementation of a computation as a nested loop structure reduces to the problem of
determining how many and which levels are to be in the totally nested loop and where the I/O and
computations go. The answers to these questions are constrained by the forces of necessity and
efficiency.


II.3.2.1 Level Position of I/O and Calculations
The levels at which each input should be read, each output should be written and each
calculation should be performed are determined by the following guidelines:

Inputs: Each input of a computation should be read at a loop level whose associated
key-tuple is identical to that of the flow's index (and on this account the totally nested loop for a
computation must contain a loop corresponding to the index of each input flow). It cannot be read
at a higher level because at such a level the key information is incomplete. To read it at a lower level
would be inefficient, because it would cause unnecessary re-reads of the flow's records.

Outputs: Similarly, each output flow of a computation must be written at a loop level whose
associated key-tuple is identical to that of the flow's index. It cannot be written at a higher level
because of insufficient key information, and to output it at a lower level would cause multiple writes
of the records.

Calculations: A flow expression should also be calculated at a loop level whose associated
key-tuple is identical to that of the flow expression's index. Again, the key information at a higher
level would be insufficient to calculate the expression, and to perform it at a lower level would be
redundant. Further economy can be realized, however, in a mixed-index flow expression if it
contains a sub-expression whose associated index is a sub-index of the flow expression as a whole;
such a sub-expression should be split off and calculated at its appropriate (higher) level.


II.3.2.2 Position of I/O and Calculations Within Their Assigned Levels

The placement of a read, write or calculation within a given loop level (i.e. in either the
Prolog, Epilog or General section) should be done with a view toward imposing the least
constraint on implementation. If done in this manner placement will preserve the maximum flexibility
in subsequent aggregation. For instance, if a calculation could go into either the Prolog or the
Epilog it should be placed in the General section. If instead it were arbitrarily placed in the
Epilog this unnecessary constraint would preclude subsequent aggregations that would require it to
be in the Prolog (loop merging in computation aggregation is discussed below).
PAY IS RATE * HOURS IF RATE PRESENT AND HOURS PRESENT
Here, both inputs have the same index (employee-id) so there is only one loop:

Level: (employee-id)
Inputs_P: empty
Prolog: empty
Outputs_P: empty

Inputs_G: {HOURS, RATE}
General: calculate PAY
Outputs_G: {PAY}

Inputs_E: empty
Epilog: empty
Outputs_E: empty

As explained above, everything is placed in the General sections.
Now consider a simple reduction flow equation:
ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID
We have seen that the implementation of such a flow equation will always have two loop levels:
Level: (item-id)
Inputs_P: empty
Prolog: initialize sum
Outputs_P: empty

Inputs_G: empty
General: empty
Outputs_G: empty

Inputs_E: empty
Epilog: empty
Outputs_E: {ITEMDEMAND}
Loop 1 (outer loop)

Level: (item-id)
Inputs_P: {PRICE}
Prolog: empty
Outputs_P: empty

Inputs_G: empty
General: empty
Outputs_G: empty

Inputs_E: empty
Epilog: empty
Outputs_E: empty

Loop 2 (inner loop)

Level: (item-id, store-id)
Inputs_P: empty
Prolog: empty
Outputs_P: empty

Inputs_G: {CURRENTORDER}
General: calculate EXTENDEDPRICE
Outputs_G: {EXTENDEDPRICE}

Inputs_E: empty
Epilog: empty
Outputs_E: empty
If two computations have level compatible loops and if the ordering constraints of the two
loops can be mutually satisfied in a single totally nested loop, aggregation is possible.

III.1.1 Level Compatibility Between Loops

It is easy to show that two loops are level compatible if and only if their level structures are
identical or empty levels (levels at which no actions are performed) can be inserted to make their
level structures identical. Some examples of level compatible totally nested loops (TNL's) and the
level structures of their aggregated results are:¹⁶


[Table: level structures of example level compatible loops TNL1 and TNL2 and of their aggregates; columns: loop, levels, levels in aggregate.]


It is interesting to note that when aggregation occurs loop levels are neither added nor deleted; that
is, the set of loop levels in the aggregate is simply the union of the sets of loop levels in the
component computations.

Some examples of loops whose level structures are incompatible are:


Loop    levels

TNL1    (K)
TNL2    (L)

TNL1    (K), (K,L)
TNL2    (L), (K,L)

TNL1    (K), (K,L), (K,L,M)
TNL2    (K), (K,M), (K,L,M)

¹⁶ In this section the symbols K, L and M denote different keys.
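The compatibility test can be sketched in Python under one assumption that goes slightly beyond the text: each level is treated as the set of keys fixed in it, so two level structures are compatible exactly when the union of their levels can be arranged as a chain in which every level is a proper subset of the next. The function name `aggregate_levels` is hypothetical.

```python
def aggregate_levels(tnl1, tnl2):
    """Return the aggregate's levels (outermost first), or None if the two
    totally nested loops are not level compatible.

    Each argument is a list of levels; a level is a tuple of key names.
    """
    # The aggregate's levels are the union of the component levels.
    levels = {frozenset(level) for level in tnl1} | {frozenset(level) for level in tnl2}
    ordered = sorted(levels, key=len)
    # Every level must fix a proper superset of the keys of the level above it.
    for outer, inner in zip(ordered, ordered[1:]):
        if not outer < inner:
            return None
    return ordered

compatible = aggregate_levels([("K",), ("K", "L")], [("K", "L"), ("K", "L", "M")])
disjoint = aggregate_levels([("K",)], [("L",)])
diverging = aggregate_levels(
    [("K",), ("K", "L"), ("K", "L", "M")],
    [("K",), ("K", "M"), ("K", "L", "M")],
)
```

In the compatible case no level is added or deleted: the result is simply the union of the input levels, ordered by depth.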


III.1.2 Order Constraint Compatibility Between Loops

ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID

FRACTION IS DEMAND/ITEMDEMAND IF DEMAND PRESENT
                              AND ITEMDEMAND PRESENT
It would seem eminently reasonable to aggregate these two computations since they have a
common input (DEMAND) and the output of the first is an input to the second. Yet they cannot be
aggregated into a totally nested loop! Their implementation descriptions reveal why. Recall that
the description of the first is:


Loop 1 (outer loop)
Level: (item-id)
Inputs_P: empty
Prolog: initialize sum
Outputs_P: empty

Inputs_G: empty
General: empty
Outputs_G: empty

Inputs_E: empty
Epilog: empty
Outputs_E: {ITEMDEMAND}

Loop 2 (inner loop)
Level: (item-id, store-id)
Inputs_P: empty
Prolog: empty
Outputs_P: empty

Inputs_G: {DEMAND}
General: calculate sum
Outputs_G: empty

Inputs_E: empty
Epilog: empty
Outputs_E: empty


The FRACTION computation also has two nested loops:

Loop 1 (outer loop)
Level: (item-id)
Inputs_P: {ITEMDEMAND}
Prolog: empty
Outputs_P: empty

Inputs_G: empty
General: empty
Outputs_G: empty

Inputs_E: empty
Epilog: empty
Outputs_E: empty

Loop 2 (inner loop)
Level: (item-id, store-id)
Inputs_P: empty
Prolog: empty
Outputs_P: empty

Inputs_G: {DEMAND}
General: do division
Outputs_G: {FRACTION}

Inputs_E: empty
Epilog: empty
Outputs_E: empty


Clearly these computations are level compatible since they have identical level structures. But the
Computations whose totally nested loops are level compatible and satisfy the above order
constraints are aggregatable.

III.2 Merging Loops

Because each action and all I/O must be performed at the same level in the aggregate as it
was before aggregation, the loop structure of the aggregation of two computations can be obtained
through a level-by-level merge of the loop levels of the two computations to be aggregated.

The algorithm for merging two totally nested loops is:
For each loop in one:

If the other has no loop at the same level, add the representation of that level to the
description of the aggregate.

If there is a corresponding loop, the two loops must be merged into one for the aggregate.
The full details of merging loops are complicated, but a brief sketch follows. Let the
corresponding loops be L1 and L2, where no output of L2 is an input to L1.¹⁷ There are three
cases:

1. Some output F of the Epilog of L1 is an input to L2.
   a. F is an input to L2's Prolog section: aggregation impossible.

   b. F is used by an action in L2's General section: move that action to the Epilog of
      the corresponding level in the aggregate, along with any actions in L2's General
      section which use, as input, some output produced by the action; all other actions
      remain in the same sections in the aggregate as they were in L1 and L2.

   c. All other cases: all actions remain in the same sections in the aggregate as they
      were in L1 and L2.


¹⁷ Obviously, the case where no output of L1 is an input to L2 will be handled exactly the same,
mutatis mutandis. The remaining case, where each has some output that is an input to the other, is
impossible.
2. Some output F, generated by some action A in the General section of L1, is an input to L2.
   a. F is an input to L2's Prolog section: move A from the General section to the Prolog
      section of the aggregate, along with any actions in the General section which have, as
      output, something used as input to that computation; all other actions remain in the
      same sections in the aggregate as they were in L1 and L2.

   b. All other cases: all actions remain in the same sections in the aggregate as they were
      in L1 and L2.

3. Neither 1 nor 2: all actions remain in the same sections in the aggregate as they were in L1
   and L2.


Basically, what this means is that a General action must move to the Prolog of the
aggregate if it must come before some action in that Prolog or if it must come before another
General action which must be moved to the Prolog; a General action must move to the Epilog if
it must come after some action in the Epilog or if it must come after another General action
which must be moved to the Epilog.
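A simplified Python sketch of this case analysis for one pair of corresponding levels follows. It is not the full algorithm, whose details the text describes as complicated. It assumes that every read, write and calculation is encoded as a (name, inputs, outputs) triple, that a level is a dictionary with sections "P", "G" and "E", and that "along with any actions" is to be read as a transitive closure; the function `merge_level` and this encoding are hypothetical.

```python
def _inputs(actions):
    return {f for _, ins, _ in actions for f in ins}

def _outputs(actions):
    return {f for _, _, outs in actions for f in outs}

def merge_level(l1, l2):
    """Merge two corresponding loop levels, where no output of l2 is an input to l1."""
    # Case 1a: an Epilog output of l1 is needed in l2's Prolog.
    if _outputs(l1["E"]) & _inputs(l2["P"]):
        raise ValueError("aggregation impossible")

    # Case 1b: General actions of l2 that need an Epilog output of l1, directly
    # or through another moved action, must move to the Epilog.
    late, to_epilog, changed = _outputs(l1["E"]), set(), True
    while changed:
        changed = False
        for name, ins, outs in l2["G"]:
            if name not in to_epilog and ins & late:
                to_epilog.add(name)
                late |= outs
                changed = True

    # Case 2a: General actions of l1 whose output is needed by l2's Prolog,
    # directly or through another moved action, must move to the Prolog.
    early, to_prolog, changed = _inputs(l2["P"]), set(), True
    while changed:
        changed = False
        for name, ins, outs in l1["G"]:
            if name not in to_prolog and outs & early:
                to_prolog.add(name)
                early |= ins
                changed = True

    # All other actions stay in the section they were in.
    return {
        "P": l1["P"] + [a for a in l1["G"] if a[0] in to_prolog] + l2["P"],
        "G": [a for a in l1["G"] if a[0] not in to_prolog]
             + [a for a in l2["G"] if a[0] not in to_epilog],
        "E": l1["E"] + [a for a in l2["G"] if a[0] in to_epilog] + l2["E"],
    }

# Outer levels of the ITEMDEMAND and FRACTION computations (case 1a).
sum_outer = {"P": [("init sum", set(), {"sum"})], "G": [],
             "E": [("write ITEMDEMAND", {"sum"}, {"ITEMDEMAND"})]}
fraction_outer = {"P": [("read ITEMDEMAND", {"ITEMDEMAND"}, {"itemdemand"})],
                  "G": [], "E": []}
try:
    merge_level(sum_outer, fraction_outer)
    blocked = False
except ValueError:
    blocked = True

# A made-up pair exercising case 1b.
first = {"P": [], "G": [], "E": [("e", {"x"}, {"T"})]}
second = {"P": [], "G": [("a", {"T"}, {"U"}), ("b", {"U"}, {"V"}), ("c", {"w"}, {"z"})],
          "E": []}
merged = merge_level(first, second)
```

The first pair reproduces the ITEMDEMAND and FRACTION situation: the Epilog output ITEMDEMAND is needed in the other loop's Prolog, so no merged level exists.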


III.3 Non-Totally Nested Loops
In this report the treatment of data driven loop implementations is restricted to loop
structures that are totally nested. Totally nested loop structures are not only broadly applicable,
but generally simple and efficient as well. In fact they often provide the most efficient and
straightforward implementations, especially when sequentially organized files, sorted by key values, are
used. For the sake of completeness, though, something should be said here about non-totally-nested
loops. Indeed, a great deal could be said about such implementations; enough, certainly, to make
one or more separate reports. Because of this the discussion here is necessarily brief and
partial.

Most importantly, it should be said that non-totally-nested loop structures are by no means
inefficient or uninteresting. They are used all the time and for good, solid reasons. Their use is
perhaps most interesting when two or more computations cannot be performed entirely concurrently
(i.e. in the same loop), but they can be performed with partial concurrency. The following two
examples illustrate.


III.3.1 Example 1: Aggregating Computations with Incompatible Order Constraints
Recall the flow equations:
ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID

FRACTION IS DEMAND/ITEMDEMAND IF DEMAND PRESENT
                              AND ITEMDEMAND PRESENT

and their implementing computations. We saw in Section III.1.2 that the implementing
computations for these flow equations could not be merged into a totally nested loop structure
because the inner loop for the first had to be completed before the inner loop of the second could
be performed. They can, however, be aggregated into a single loop with a structure like:
for each (item-id) from DEMAND
    sum = undefined
    for each (store-id) from DEMAND(item-id)
        <calculate sum>
    end
    if defined[sum] then ITEMDEMAND(item-id) = sum
    for each (store-id) from DEMAND(item-id)
        <calculate FRACTION>
    end
end


This is a non-totally nested loop structure, since two loops (the inner ones) appear at the same level.
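For illustration, the same non-totally nested structure can be written as a small Python function. The names are hypothetical, flows are modeled as dictionaries keyed by index tuples, and buffering each item's keys in a list stands in for reading DEMAND(item-id) a second time. The guard against a zero sum is an added assumption, not part of the flow equation.

```python
from itertools import groupby

def itemdemand_and_fraction(DEMAND):
    """DEMAND: {(item_id, store_id): datum}. Returns (ITEMDEMAND, FRACTION)."""
    ITEMDEMAND, FRACTION = {}, {}
    for item_id, segment in groupby(sorted(DEMAND), key=lambda k: k[0]):
        keys = list(segment)            # the records of DEMAND for this item_id
        total = None                    # sum = undefined
        for key in keys:                # first inner loop: calculate sum
            total = DEMAND[key] if total is None else total + DEMAND[key]
        if total is not None:
            ITEMDEMAND[(item_id,)] = total
        for key in keys:                # second inner loop: calculate FRACTION
            # ITEMDEMAND is used as soon as it is generated; it is never re-read.
            if total is not None and total != 0:
                FRACTION[key] = DEMAND[key] / total
    return ITEMDEMAND, FRACTION

# Hypothetical sample data.
DEMAND = {("bolt", "s1"): 1, ("bolt", "s2"): 3, ("nut", "s1"): 2}
ITEMDEMAND, FRACTION = itemdemand_and_fraction(DEMAND)
```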
It is interesting to compare this aggregate implementation with the unaggregated
implementation of the two computations involved (as separate loops in separate job steps). On the
one hand, in both implementations every record of the DEMAND flow must be accessed twice, so no
accesses are eliminated by aggregation. On the other hand, accesses of the records of the
ITEMDEMAND flow are eliminated by aggregation. If the computations are implemented separately,
every record of ITEMDEMAND must be written into a file by the first computation and then read back
by the second; whereas in the aggregate implementation the records are used as they are generated,
so no re-reading is necessary.¹⁸

In general, a situation in which two computations are level compatible but
their aggregate cannot be implemented as a totally nested loop is where, for some loop level,
the output of the Epilog section of one is an input to the Prolog section of the other (as is the case
with ITEMDEMAND above). In such a case the corresponding loop level of the aggregate can be
implemented (as above) as two loops of the same level performed in sequence, and re-reads of the
flow in question will be saved.
III.3.2 Example 2: Aggregating Computations That Are Not Level Compatible

In Section III.1.1 we saw that computations with the following level structures were not level
compatible with one another:

TNL1    (K), (K,L), (K,L,M)
TNL2    (K), (K,M), (K,L,M)

¹⁸ In fact, if these records are not used by any other computation in the data processing system,
it is not necessary to write them out into a file either.

There is no total nesting of loops that will implement their aggregate. They might, however, be said to be partially
level-compatible, since the outermost levels have identical keys. If a common driver set can be
found for that level, they might be implemented as a non-totally-nested loop structure. The
following is a possible implementation skeleton:
for each (K) from D0
    for each (L) from D1
        for each (M) from D2
            ...
        end
    end
    for each (M) from D3
        for each (L) from D4
            ...
        end
    end
end
where the Di are distinct drivers.
This is another commonly found construct in file data processing. It is the case where, for a
common set of values for the sub-index (K), two or more independent computations are to be
performed. As in the previous example, there is some I/O saving (over separate implementations
of the computations involved) because each record of D0 has to be read only once.
IV.1 A Theory of Index Sets and Critical Index Sets for Data Driven Loops

Let us begin with some definitions and useful consequences of these definitions.

IV.1.1 Definitions and Useful Lemmas
We redefine the notions of a flow's index set and critical index set formally and introduce the
operators Proj, Inj and Restr:

Definition: The index set of a flow F with index I is defined as
$$\mathrm{IS}(F) = \{\, I \mid \text{there is a record in } F \text{ for } I \,\}$$

Definition: The critical index set of a flow F (with index I) with respect to a flow X is defined as
$$\mathrm{CIS}_X(F) = \{\, I \mid \text{there is a record in } F \text{ for } I \text{ that is necessary to generate some record in } X \,\}$$

Definition: The projection of an index set S with index $(k_1, \ldots, k_m, k_{m+1}, \ldots, k_n)$ onto the
sub-index $(k_1, \ldots, k_m)$ is defined as
$$\mathrm{Proj}(S, (k_1, \ldots, k_m)) = \{\, (k_1, \ldots, k_m) \mid \exists\, k_{m+1}, \ldots, k_n \text{ such that } (k_1, \ldots, k_m, k_{m+1}, \ldots, k_n) \in S \,\}$$

Definition: The injection of an index set S with index $(k_1, \ldots, k_m)$ by the index set T with
super-index $(k_1, \ldots, k_m, k_{m+1}, \ldots, k_n)$ is defined as
$$\mathrm{Inj}(S, T) = \{\, (k_1, \ldots, k_m, k_{m+1}, \ldots, k_n) \mid (k_1, \ldots, k_m) \in S \wedge (k_1, \ldots, k_m, k_{m+1}, \ldots, k_n) \in T \,\}$$

Definition: The restriction of an index set S with index $(k_1, \ldots, k_n)$ by the condition C (whose
truth depends on the values of the keys $k_1, \ldots, k_n$) is defined as
$$\mathrm{Restr}(S, C) = \{\, (k_1, \ldots, k_n) \in S \mid C \text{ is true} \,\}$$
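The three operators translate directly into Python set comprehensions. The sketch below assumes that index sets are sets of key tuples and that a sub-index is always a leading prefix of its super-index, as in the definitions above; the lowercase function names and the sample sets are hypothetical.

```python
def proj(S, m):
    """Projection of index set S onto its first m keys."""
    return {k[:m] for k in S}

def inj(S, T):
    """Injection of S (m-key tuples) by T (n-key tuples, n >= m)."""
    S = set(S)
    m = len(next(iter(S))) if S else 0
    return {k for k in T if k[:m] in S}

def restr(S, cond):
    """Restriction of S by a condition on the key values."""
    return {k for k in S if cond(*k)}

# Hypothetical index sets: S has index (k1), T has index (k1, k2).
S = {("a",), ("b",)}
T = {("a", 1), ("a", 2), ("c", 1)}

injected = inj(S, T)
projected = proj(T, 1)
restricted = restr(T, lambda k1, k2: k2 > 1)
```

On these sets the injection keeps only the members of T whose first key occurs in S, so it is necessarily a subset of T.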

From the last three definitions the following simple but useful results (stated without proof)
can be obtained:

Lemma 1: If A is an index set with index I, then
$$\mathrm{Proj}(A, I) = A$$

Lemma 2:
$$\mathrm{Inj}(A, B) = B \cap \mathrm{Inj}(A, B)$$
In particular, if A and B are index sets with the same index, then
$$\mathrm{Inj}(A, B) = B \cap A$$

Lemma 3: If S and T are index sets where the index of T is a super-index of that of S, then
$$\mathrm{Inj}(S, T) \subseteq T$$

Lemma 4: If T is an index set with index $I_T$ and S is an index set with index $I_S$, a sub-index
of $I_T$, then
$$\mathrm{Proj}(\mathrm{Inj}(S, T), I_S) = S \cap \mathrm{Proj}(T, I_S)$$
Corollary 1: Let F be defined as in Theorem 1. Then for any flow $F_i$ with index identical to
that of F
$$\mathrm{CIS}_F(F_i) = \mathrm{IS}(F)$$
Theorem 2: If R is a flow (with index $I_R$) described by the application of a reduction operator
to a flow expression expr in terms of the flows $F_1, \ldots, F_n$ where a flow $F_i$ has index $I_i$ (e.g.
the flow equation for R is: R IS SUM OF expr FOR EACH <$I_R$>) then
$$\mathrm{CIS}_R(F_i) = \mathrm{Proj}(\mathrm{IS}(expr), I_i)$$
(Note that the index of expr must be a super-index of $I_i$.)
This theorem simply says that when a flow (as that described by expr) is reduced every record of
that flow is used in calculating the result. From Theorem 1 we have in turn that the critical index
set of each $F_i$ with respect to the flow to be reduced is given by the expression on the right-hand
side of the above equation.
Corollary 2: If R is a flow (with index $I_R$) described by the application of a reduction operator
to a flow F (e.g. R IS SUM OF F FOR EACH <$I_R$>), then
$$\mathrm{CIS}_R(F) = \mathrm{IS}(F)$$

The following theorems concern the nature of the index sets of flow expressions. First, a
simple result about flows described by reductions:
Theorem 3: If R is a flow (with index $I_R$) described by the application of a reduction operator
to a flow expression expr (e.g. the flow equation for R is: R IS SUM OF expr FOR EACH <$I_R$>),
then
$$\mathrm{IS}(R) = \mathrm{Proj}(\mathrm{IS}(expr), I_R)$$
$$\mathrm{IS}(safe(F_1, \ldots, F_n)) = \begin{cases} \mathrm{IS}(F_1) & \text{if } n = 1 \\ \bigcap_i \mathrm{IS}(F_i) & \text{if } n > 1 \end{cases}$$


As mentioned above the only legal arithmetic flow expression in FE-HIBOL is a safe, or a
safe further qualified by some condition. This further qualification must take the form of a logical
expression ANDed with the safe. Thus, to complete our treatment of arithmetic flow expressions we
only need the following simple theorem:

Theorem 5: The index set of a simple arithmetic flow expression safe qualified by the
condition C is given by
$$\mathrm{IS}(safe \text{ AND } C) = \mathrm{Restr}(\mathrm{IS}(safe), C)$$


Consideration of special cases leads to three corollaries:

Corollary 4: By Lemmas 2 and 5
$$\mathrm{IS}(safe \text{ AND } G \text{ PRESENT}) = \mathrm{Inj}(G, \mathrm{IS}(safe)) = \mathrm{IS}(safe) \cap \mathrm{Inj}(G, \mathrm{IS}(safe))$$

Corollary 5:
$$\mathrm{IS}(safe \text{ AND } (C_1 \text{ AND } C_2)) = \mathrm{Restr}(\mathrm{IS}(safe), C_1) \cap \mathrm{Restr}(\mathrm{IS}(safe), C_2)$$

Corollary 6:
$$\mathrm{IS}(safe \text{ AND } (C_1 \text{ OR } C_2)) = \mathrm{Restr}(\mathrm{IS}(safe), C_1) \cup \mathrm{Restr}(\mathrm{IS}(safe), C_2)$$

For conditional expressions with two cases¹⁹ we have the following result:
Theorem 6: Let E be a conditional flow expression of two terms:

E = expr1 IF C1
    ELSE expr2 IF C2

¹⁹ The extension of this theorem to more than two cases is trivial.
In level 1 we have the output R and the driver F. The index set $D_1$ enumerated by this driver at
this level is²⁰
$$D_1 = \mathrm{Proj}(\mathrm{IS}(F), (k_1)) = \mathrm{IS}(R) \quad \text{(by Theorem 3)}$$
thus satisfying the driving constraint for the output R.
In level 2 we have the input F and the driver F. The index set $D_2$ enumerated by this driver
at this level is
$$D_2 = \mathrm{IS}(F) = \mathrm{CIS}_R(F) \quad \text{(by Corollary 2)}$$
thus satisfying the driving constraint for the input F.


Example 2:

PAY IS HOURS * 3.00 IF HOURS PRESENT AND
                       NOT HOURS > 40
    ELSE 120 + (HOURS - 40) * 4.5 IF HOURS PRESENT
We shall use this example to illustrate Theorem 6. Define E1 and E2 by
E1 = HOURS * 3.00 IF HOURS PRESENT AND NOT HOURS > 40
and
E2 = 120 + (HOURS - 40) * 4.5 IF HOURS PRESENT AND
                                 NOT (HOURS PRESENT
                                      AND NOT HOURS > 40)
By pure logical simplification the last equation can be rewritten:

E2 = 120 + (HOURS - 40) * 4.5 IF HOURS PRESENT AND
                                 HOURS > 40

From Theorem 6 we have that

²⁰ Theorem 8 of the next section provides a formal treatment of enumerated index sets.
for each (item-id) from C
    get P(item-id)
    for each (store-id) from C(item-id)
        get C(item-id, store-id)
        EP(item-id, store-id) = ...
        if defined[EP(item-id, store-id)]
           then write EP(item-id, store-id)
    end
end
In level 1 the input is P and the driver is C. The index set $D_1$ enumerated by this driver at
this level is
$$D_1 = \mathrm{Proj}(\mathrm{IS}(C), (\text{item-id})) \supseteq \mathrm{IS}(P) \cap \mathrm{Proj}(\mathrm{IS}(C), (\text{item-id})) = \mathrm{CIS}_{EP}(P)$$

In level 2 the input is C, the output is EP and the driver is C. The index set $D_2$ enumerated
by this driver at this level is
$$D_2 = \mathrm{IS}(C) \supseteq \mathrm{Inj}(\mathrm{IS}(P), \mathrm{IS}(C)) \quad \text{(by Lemma 3)}$$
$$= \mathrm{CIS}_{EP}(C) = \mathrm{IS}(EP)$$

Thus we see that the flow C is (at least) adequate to drive both levels.
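These inclusions can be checked on small data with a Python sketch. Everything concrete here is an assumption made for illustration: the sample records of P and C, the helper names `proj` and `inj`, and the choice of a product as the calculation of EP (the listing leaves the calculation unspecified).

```python
def proj(S, m):
    """Projection of an index set onto its first m keys."""
    return {k[:m] for k in S}

def inj(S, T):
    """Injection of S (m-key tuples) by T (n-key tuples, n >= m)."""
    S = set(S)
    m = len(next(iter(S))) if S else 0
    return {k for k in T if k[:m] in S}

# Hypothetical flows: P has index (item-id), C has index (item-id, store-id).
P = {("bolt",): 2.0, ("nut",): 0.5, ("gear",): 9.0}
C = {("bolt", "s1"): 10, ("bolt", "s2"): 4, ("nut", "s1"): 6, ("washer", "s1"): 1}

EP = {}
for item_id in sorted(proj(C, 1)):          # level 1, driven by C
    price = P.get(item_id)                  # get P(item-id)
    for key in sorted(k for k in C if k[:1] == item_id):   # level 2, driven by C
        if price is not None:               # EP is defined only if P is present
            EP[key] = price * C[key]        # assumed calculation

D1 = proj(C, 1)                             # indices enumerated at level 1
cis_P = set(P) & proj(C, 1)                 # records of P that are actually needed
D2 = set(C)                                 # indices enumerated at level 2
```

Driving level 1 by C touches "washer", for which P has no record, but never touches "gear", for which C has no record; so only usable records of P are fetched successfully, and none is fetched needlessly.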


IV.1.4 Driving Flow Set Sufficiency

We wish to be able to determine whether a set of input flows is sufficient to drive a
computation loop level. Let us begin by defining the notion of the necessary index set for a
computation level:

Definition: The necessary index set at level i for a computation C (denoted $\mathrm{NIS}_i(C)$) is defined as
the set of index values necessary to drive level i of the loop implementing C.


By the fundamental driving constraint we have

Theorem 7: The necessary index set for level i of a computation C is
$$\mathrm{NIS}_i(C) = \Bigl(\bigcup_{F \in \mathcal{I}_i,\; G \in \mathcal{O}(C)} \mathrm{CIS}_G(F)\Bigr) \cup \Bigl(\bigcup_{F \in \mathcal{O}_i} \mathrm{IS}(F)\Bigr)$$
where $\mathcal{O}(C)$ = outputs of computation C
$\mathcal{O}_i$ = outputs of level i
$\mathcal{I}_i$ = inputs of level i

Now loop levels can be driven by inputs read at that level or at lower levels; inputs read at higher
levels do not have enough keys in their indices. The index set enumerated by a driving
flow read at a lower level is given by the following theorem:

Theorem 8: The index set enumerated at loop level i (with index $I_i$) by a driving flow F read
at that level or at a lower level is
$$\mathrm{Proj}(\mathrm{IS}(F), I_i)$$


Using the terminology just introduced we have:
Theorem 9: A set S of flows is sufficient to drive level i if and only if
$$\mathrm{NIS}_i(C) \subseteq \bigcup_{F \in S} \mathrm{Proj}(\mathrm{IS}(F), I_i)$$
that is, if and only if the set of indices enumerated at level i by the flows in S contains the necessary index set
for that level.


IV.1.5 Minimal Driving Flow Sets

The set of all inputs of a computation is sufficient to drive that computation. We are
interested in finding the smallest subsets of this set that will provide sufficient drivers for each
level. This interest stems from our implementation constraint that all drivers must be read
sequentially and must have compatible sort orders. If all contained inputs were used to drive each
level of a computation loop, all inputs to that computation would have to have compatible sort
orders and all would have to be read sequentially, a constraint that is often unnecessarily severe.

Moreover, from an efficiency point of view, we generally want the set of indices enumerated
by the drivers at any level to be as small as possible (while satisfying the fundamental driving
constraints) so as to minimize the number of iterations. For example, if we are trying to minimize
I/O accesses and we have a loop that reads some (non-driving) flow by random access, the fewer
iterations there are the fewer attempts there will be to access records from that flow.

Consider, for example, the EP computation (Example 3 above). The inputs contained in the
outer loop are P and C. Both together could have been used as a driving flow set for that level.
We were able to show, however, that C alone was sufficient to drive the outer loop. Thus we came
up with an implementation in which only the flow C had to be sorted and read sequentially.
Additionally, in this implementation only those records of P that can actually be used are fetched.

It is important to note that using some smallest driving flow set for each level does not
always improve efficiency. In the computation above it can be shown that P alone is sufficient to
drive the outer loop. However, such an implementation would be no better than one in which the
outer loop is driven by both inputs. Since the inner loop must be driven by C in any case, we
would still end up using both inputs as drivers: both would have to be sorted compatibly and read
sequentially, and more records of P would be read than would actually be used.
$A \supseteq B \equiv (x \in B \supset x \in A)$

The expression on the right of the equivalence symbol (≡) is a formula in the first order
predicate calculus. If this formula can be shown to be a tautology the corresponding set inclusion is
proved. Showing that a formula is a tautology is equivalent to showing that it simplifies to T.
Since powerful first order predicate calculus simplifiers exist, the task of proving set inclusion can
be solved by recasting the hypothesis as a predicate calculus formula and trying to simplify it. If it
can be simplified to T inclusion is proved; if it simplifies to F inclusion is disproved.

When the formula cannot be simplified to either T or F, the meaning of the result is not
clear. Either the simplification is correct (in which case the formula is not a tautology, and thus set
inclusion does not hold) or the simplifier has run up against a fundamental limitation²⁴ and has
failed to simplify the formula completely. In the latter case the formula may in fact be equivalent to
T (implying set inclusion), but the simplifier is unable to determine it. Because of this uncertainty,
the wisest assumption is the conservative one: when simplification to T does not occur, set
inclusion does not hold.
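
The inclusion test can be illustrated with a small sketch. It is not the simplifier referred to above: it assumes a purely propositional formula (each atom, such as a DEFINED term, is treated as an independent Boolean) and checks every truth assignment by brute force. The tuple encoding and the names `is_tautology` and `includes` are hypothetical.

```python
from itertools import product

def atoms(formula):
    """Collect the atom names appearing in a nested-tuple formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(arg) for arg in formula[1:]))

def evaluate(formula, env):
    """Evaluate a formula under a truth assignment env: atom name -> bool."""
    if isinstance(formula, str):
        return env[formula]
    op, args = formula[0], formula[1:]
    if op == "AND":
        return all(evaluate(a, env) for a in args)
    if op == "OR":
        return any(evaluate(a, env) for a in args)
    if op == "NOT":
        return not evaluate(args[0], env)
    if op == "IMPLIES":
        return (not evaluate(args[0], env)) or evaluate(args[1], env)
    raise ValueError("unknown operator: %r" % (op,))

def is_tautology(formula):
    """True iff the formula is true under every assignment of its atoms."""
    names = sorted(atoms(formula))
    return all(evaluate(formula, dict(zip(names, values)))
               for values in product([False, True], repeat=len(names)))

def includes(a_cf, b_cf):
    """A includes B iff membership in B implies membership in A."""
    return is_tautology(("IMPLIES", b_cf, a_cf))
```

Unlike a symbolic simplifier, this enumeration always terminates with a definite answer, but only because quantifiers and arithmetic are left out of the encoding.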
IV.2.1 Characteristic Functions for Index Sets

In this section the particulars of the syntax²⁵ and semantics of characteristic functions for
index sets are presented.

The characteristic function for an index set is a logical expression (predicate) in terms of
the keys of its index that is true for an assignment of values to those keys in exactly those cases in
which the index set contains a corresponding index value. That is, if $S_{cf}(k_1, \ldots, k_n)$ denotes the
characteristic function for the index set S then

$S_{cf}(b_1, \ldots, b_n) = T$ iff S contains an index value with $k_1 = b_1, \ldots, k_n = b_n$

24 It is a well-known fact that it is impossible to devise a procedure that will correctly simplify
every formula in the first order predicate calculus.

25 Because our work is implemented in LISP, the notation is unabashedly LISPish.


1. Standard logical operators²⁶

   a. AND: (AND p1 ... pn) = T for a particular key-tuple instance iff all of the pi are true
      for that instance

   b. OR: (OR p1 ... pn) = T for a particular key-tuple instance iff any of the pi are true
      for that instance

   c. NOT: (NOT p) = T for a particular key-tuple instance iff p is false for that instance

   d. FOR-SOME: (FOR-SOME (k1 ... km) p(k1, ..., km, km+1, ..., kn)) = T for a
      particular key-tuple instance (km+1, ..., kn) iff there exist values of k1, ..., km
      for which the predicate p is true

2. Standard arithmetic comparison operators (their arguments must be arithmetic expressions
   built with +, -, * and /)

   a. EQUAL: (EQUAL expr1 expr2) = T iff expr1 and expr2 have the same numerical
      value

   b. GREATERP: (GREATERP expr1 expr2) = T iff the numerical value of expr1 is
      greater than that of expr2

3. The special operator DEFINED: (DEFINED (V per k1 ... kn)) = T iff there is a record
   in the variable V in period per for the key-tuple instance (k1, ..., kn). The argument of the
   DEFINED operator must be a variable.

The terms introduced here are explained in greater detail in the following sections.

26 The symbols p and pi denote predicates.
IV.2.1.1 Variables

A variable is a representation of a HIBOL flow, with key and period information attached.
The period uniquely identifies the variable in time (i.e. it specifies a particular "incarnation" of the
flow). An assignment of values to a variable's index and its period specifies an instance of that
variable, and this instance is said to be defined if there is a datum (and thus a record) corresponding
to the key and period values named in the assignment.
The general form for a variable is
   (flow-name period key1 ... keyn)
where flow-name is the name of the associated flow²⁷, the slot period contains the name of the
period in which the variable is generated or input, and the slots keyi contain the names of the keys
of the variable. An example of a variable specification is
   (ENROLLED term student subject-number)
where
   ENROLLED is the name of the variable
   term is the name of a period
   student and subject-number are the names of the variable's keys
An occurrence of a variable in a predicate is called a variable reference. In a variable
reference the form in the period slot identifies a particular incarnation of the variable (if the
period slot contains TERM that means that this term's incarnation of the variable is being referred
to; if it contains (PLUS TERM -1), last term's incarnation is referred to).


27 The variable and the flow have the same name. 
IV.2.1.2 (DEFINED variable-reference)

This expression is true if and only if variable-reference is defined. In particular an
expression like
   (DEFINED (ENROLLED term student subject-number))
is true for an assignment of constant values to each of its keys and its period if and only if the
variable ENROLLED in the specified period contains a record corresponding to the specified index
value; otherwise it is false. Thus, for example, the predicate above is true for student = JOE,
subject-number = 33 and term = TERM if and only if in this term's incarnation of ENROLLED there
is a record for the index value (JOE 33) (i.e. if and only if Joe is enrolled in subject 633 during
the current term).


IV.2.1.3 Correspondence Between Logical and Set Theoretic Notation

In our characteristic function/index set notation the general correspondence between logical
and set operators is given by:

   logical operator                      set operator

   (AND S_cf T_cf)                       S ∩ T
   (OR S_cf T_cf)                        S ∪ T
   (FOR-SOME (k1 ... km) S_cf)           Proj(S, (km+1, ..., kn))
   (AND S_cf C)                          Restr(S, C)
   (AND S_cf T_cf)                       Inj(S, T)
   (DEFINED (V ...))                     IS(V)


That is: 


the characteristic function of the intersection of two sets is the logical AND of their
characteristic functions;

the characteristic function of the union of two sets is the logical OR of their characteristic
functions;

the characteristic function of the projection Proj(S, I*) of an index set S onto the sub-index
I* is the FOR-SOME operator applied to the characteristic function of S and the remaining
keys;

the characteristic function of the restriction Restr(S, C) of an index set S by the condition C
is the logical AND of the characteristic function of S and the condition C;

the characteristic function of the injection Inj(S, T) of an index set S by the index set T is
the logical AND of their characteristic functions;

the characteristic function of the index set IS(V) of a variable V is the DEFINED operator
applied to that variable.


This mapping can be used to determine the characteristic function of each index set expression
encountered above.

Examples:

The index set
   IS(P)
has the characteristic function
   (DEFINED (P DAY item-id))

The index set
   IS(P) ∩ Proj(IS(C), (item-id))
has the characteristic function
   (AND (DEFINED (P DAY item-id))
        (FOR-SOME (store-id) (DEFINED (C DAY item-id store-id))))

The index set
   Restr(IS(HOURS), NOT HOURS > 48)
has the characteristic function
   (AND (DEFINED (HOURS WEEK employee-id))
        (NOT (GREATERP (HOURS WEEK employee-id) 48)))
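
The correspondence can be checked on small data. The following is only a sketch: flows are modelled as Python dictionaries from key-tuples to data, the period is ignored, and the sample records and the helper names `index_set`, `proj` and `restr` are invented for illustration.

```python
def index_set(flow):
    """IS(V): the set of key-tuples for which the flow has a record."""
    return set(flow)

def proj(indices, positions):
    """Proj(S, sub-index): keep only the key positions listed in `positions`."""
    return {tuple(idx[p] for p in positions) for idx in indices}

def restr(indices, condition):
    """Restr(S, C): keep only the index values that satisfy the condition."""
    return {idx for idx in indices if condition(idx)}

# Hypothetical sample flows (keys first, datum as the dictionary value).
P = {("nails",): 5, ("screws",): 7}                       # index (item-id)
C = {("nails", "store-1"): 10, ("bolts", "store-2"): 4}   # index (item-id, store-id)
HOURS = {("emp-1",): 40, ("emp-2",): 52}                  # index (employee-id)

# Set-theoretic form: IS(P) intersected with Proj(IS(C), (item-id)).
by_sets = index_set(P) & proj(index_set(C), [0])

# Logical form: (AND (DEFINED P) (FOR-SOME (store-id) (DEFINED C))).
def cf_p_and_some_c(item_id):
    return (item_id,) in P and any(key[0] == item_id for key in C)

by_predicate = {(i,) for i in ["nails", "screws", "bolts"] if cf_p_and_some_c(i)}

# Restr(IS(HOURS), NOT HOURS > 48).
not_overtime = restr(index_set(HOURS), lambda idx: not HOURS[idx] > 48)
```

Both routes select the same index values, which is the sense in which the logical and set notations are interchangeable.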


We would like our characteristic functions to contain as much information as possible
so as to facilitate the comparison of index sets.

The only possible characteristic function for a variable (V per k1 ... kn) that is a system
input (i.e. a non-computed variable) is the
trivial one (DEFINED (V per k1 ... kn)), because all that can be said is that it contains a record
iff it contains a record.

In some cases an input variable may have the special property that it will always contain a
record for every allowable index value. (Knowledge of such a property cannot be deduced from
the HIBOL specification of a data processing system; it must be supplied separately.) Such a
variable is termed dense or full. An example might be a variable indexed by (item-id) which in every
incarnation should have a record for every possible value of that index. In such a case
the characteristic function of such a variable is simply T.

We could use the trivial characteristic function for a computed variable as well, but more
(useful) information can be obtained through the application of Theorems 3-6 to the defining
HIBOL flow equation. Likewise, we can use Theorems 1 and 2 to obtain useful characteristic
functions for critical index sets. Characteristic functions thus obtained are called one-step
characteristic functions.

It should be easy to see that for any characteristic function, if an occurrence of (DEFINED
variable) is replaced by the characteristic function of that variable, the result will be a logically
equivalent characteristic function. This is called back-substitution of characteristic functions. If
back-substitution is applied recursively, the result will be a characteristic function containing only
DEFINED's whose arguments are non-computed variables. This is called total back-substitution.
Total back-substitution of all characteristic functions has the advantage of making them all into a
uniform form, thus facilitating comparison and logical manipulation.
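
Total back-substitution can be sketched as a recursive rewrite. The sketch assumes formulas are nested tuples, that variables are abbreviated to their flow names (period and keys omitted), and that the definitions are not circular; the function name `back_substitute` and the table `one_step` are hypothetical.

```python
def back_substitute(formula, one_step):
    """Replace every (DEFINED v), where v is a computed variable, by v's
    one-step characteristic function, recursively (total back-substitution)."""
    if not isinstance(formula, tuple):
        return formula  # a variable name or a constant
    if formula[0] == "DEFINED" and formula[1] in one_step:
        return back_substitute(one_step[formula[1]], one_step)
    return (formula[0],) + tuple(back_substitute(arg, one_step)
                                 for arg in formula[1:])

# One-step characteristic functions of three computed variables;
# H and R are system inputs and therefore have no entry.
one_step = {
    "S": ("AND", ("DEFINED", "H"), ("DEFINED", "R")),
    "X": ("AND", ("DEFINED", "H"), ("DEFINED", "R"), ("GREATERP", "H", 48)),
    "P": ("OR", ("DEFINED", "S"), ("DEFINED", "X")),
}

total_p = back_substitute(("DEFINED", "P"), one_step)
```

After the rewrite, `total_p` mentions only DEFINED terms on the inputs H and R, so it can be compared directly with other totally back-substituted functions.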


IV.2.3 Example

Consider the flow equations:

   S IS H * R IF H PRESENT AND R PRESENT

   X IS (H - 48) * R / 2 IF H PRESENT AND R PRESENT AND H > 48

   P IS S + X IF S PRESENT AND X PRESENT
        ELSE S IF S PRESENT
        ELSE X IF X PRESENT

where the flows H and R are system inputs, all flows have the index (key) and all computations are
performed daily. The one-step characteristic functions of the necessary index sets are:²⁸

   NIS(S)_cf = (AND (DEFINED (H DAY key))
                    (DEFINED (R DAY key)))

   NIS(X)_cf = (AND (DEFINED (H DAY key))
                    (DEFINED (R DAY key))
                    (GREATERP (H DAY key) 48))

   NIS(P)_cf = (OR (DEFINED (S DAY key))
                   (DEFINED (X DAY key)))


From these we deduce (by Theorem 9) the following results:

1. Computation S can be driven by either H or R, since both

   NIS(S)_cf ⊃ (DEFINED (H DAY key))                  (1a)
and
   NIS(S)_cf ⊃ (DEFINED (R DAY key))                  (1b)
are true.

2. Computation X can be driven by either H or R, since both

   NIS(X)_cf ⊃ (DEFINED (H DAY key))                  (2a)
and
   NIS(X)_cf ⊃ (DEFINED (R DAY key))                  (2b)
are true.

3. Computation P must be driven by both S and X, since neither

   NIS(P)_cf ⊃ (DEFINED (S DAY key))                  (3a)
nor
   NIS(P)_cf ⊃ (DEFINED (X DAY key))                  (3b)
are true, but

   NIS(P)_cf ⊃ (OR (DEFINED (S DAY key))
                   (DEFINED (X DAY key)))             (3c)
is.

28 We use the outputs as the computation names and drop the level subscript since there is only
one level.


However, we know that

   IS(S)_cf = (AND (DEFINED (H DAY key))
                   (DEFINED (R DAY key)))

   IS(X)_cf = (AND (DEFINED (H DAY key))
                   (DEFINED (R DAY key))
                   (GREATERP (H DAY key) 48))

so back-substitution of characteristic functions yields

   NIS(P)_cf = (OR (DEFINED (S DAY key))
                   (DEFINED (X DAY key)))

             = (OR (AND (DEFINED (H DAY key))
                        (DEFINED (R DAY key)))
                   (AND (DEFINED (H DAY key))
                        (DEFINED (R DAY key))
                        (GREATERP (H DAY key) 48)))

             = (AND (DEFINED (H DAY key))
                    (DEFINED (R DAY key)))

Thus, formula (3a)

   NIS(P)_cf ⊃ (DEFINED (S DAY key))

becomes

   (AND (DEFINED (H DAY key)) (DEFINED (R DAY key)))
   ⊃
   (AND (DEFINED (H DAY key)) (DEFINED (R DAY key)))

which is obviously true. Thus, back-substitution has revealed that computation P can be driven by
S alone.
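
The conclusion of this example can be confirmed mechanically. This is a sketch under the assumption that the three atoms (H defined, R defined, H greater than 48) may be treated as free Booleans; the helper `holds_for_all` and the lambda names are illustrative only.

```python
from itertools import product

def holds_for_all(predicate):
    """True iff predicate(h, r, g) holds for every truth assignment, where
    h = (DEFINED H), r = (DEFINED R) and g = (GREATERP H 48)."""
    return all(predicate(h, r, g)
               for h, r, g in product([False, True], repeat=3))

def s_cf(h, r, g):
    return h and r            # IS(S) after back-substitution

def x_cf(h, r, g):
    return h and r and g      # IS(X) after back-substitution

def nis_p_cf(h, r, g):
    return s_cf(h, r, g) or x_cf(h, r, g)

# Formula (3a) after back-substitution: NIS(P) implies (DEFINED S).
p_driven_by_s = holds_for_all(lambda h, r, g: (not nis_p_cf(h, r, g)) or s_cf(h, r, g))

# Formula (3b) after back-substitution: NIS(P) implies (DEFINED X).
p_driven_by_x = holds_for_all(lambda h, r, g: (not nis_p_cf(h, r, g)) or x_cf(h, r, g))
```

The first implication holds for all eight assignments, while the second fails when H and R are defined but H does not exceed 48, so S alone can drive P but X alone cannot.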




tocing a te phenom of he met en siehe tg: ten ig 


V.1.1 Simple Computation
Consider the HIBOL flow equation:
   PAY IS HOURS * 3.00

in which both flows are indexed by employee-id. The simplest implementation of this computation
in PL/I is a loop that iterates the following steps:
   read a record of the HOURS file


29 We make a distinction between 
   extract the data item (the number of hours worked)
   multiply it by 3.00
   assemble the corresponding record of PAY
      whose employee-id key is the same as the record read
      whose data item value is the result of multiplying the value of the data item of
      the record read by 3.00
   write the newly created record to the file PAY

To support this iteration, there must be
   declarations of the data objects to be used
   loop initialization
   EOF (end-of-file) checking (to terminate the loop)


V.1.1.1 Necessary Data Objects and Their Declaration

First there must be declarations for all input and output files. Assume that the files PAY and
HOURS are known by these names to the PL/I environment (JCL code can be generated to make
this happen). Then the following declarations must appear in the PL/I code:

   DECLARE HOURS INPUT FILE SEQUENTIAL RECORD,
           PAY OUTPUT FILE SEQUENTIAL RECORD;

There must also be declarations for data structures ancillary to the I/O and control to be
performed. In particular, for every input file there must be a record image data structure into
which a record of that input can be read. Likewise, for every output file there must be a record
image data structure into which a record of that output can be built so that it can be written out.
In our simple example, the HOURS and PAY files must have such associated data objects. The PL/I
structure can be used for this purpose:
   DECLARE 1 PAY_RECORD,
             2 EMPLOYEE FIXED DECIMAL (4),
             2 PAY FIXED DECIMAL (4),
           1 HOURS_RECORD,
             2 EMPLOYEE FIXED DECIMAL (4),
             2 HOURS FIXED DECIMAL (3);

Finally, for each input a flag is needed to indicate the EOF condition for that input. Thus, for the
HOURS file we would have the declaration:

   DECLARE 1 EOF ALIGNED,
             2 HOURS BIT (1) UNALIGNED INITIAL ('0'B);

When EOF occurs on the associated file this flag is set to '1'B.


V.1.1.2 Loop Initialization

Before iteration all flags must be initialized. This can be done by the use of the INITIAL
attribute in the declaration (as above for EOF.HOURS). Also all drivers must be read to establish
initial values for their indices. In our example, the initialization section would consist of merely:

   READ FILE (HOURS) INTO (HOURS_RECORD);

To detect an EOF condition on a file and set its corresponding flag, the PL/I ON construct
can be used. For the HOURS file the appropriate code would be

   ON ENDFILE (HOURS) EOF.HOURS = '1'B;

To enforce iteration termination upon EOF of the driver, the loop is constructed using the
form DO WHILE (¬ EOF.driver).
V.1.1.4 The Loop Itself

Given this supporting structure, the rest of the implementation is easy. The loop itself can be
written simply as:

   DO WHILE (¬ EOF.HOURS);
      PAY_RECORD.PAY = HOURS_RECORD.HOURS * 3.00;
      PAY_RECORD.EMPLOYEE = HOURS_RECORD.EMPLOYEE;
      WRITE FILE (PAY) FROM (PAY_RECORD);
      READ FILE (HOURS) INTO (HOURS_RECORD);
   END;

When the loop terminates, the job step is ended and the input and output files are automatically
closed. The complete PL/I program for the pay calculation computation is given in Fig. 1.
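
For readers unfamiliar with PL/I, the same read-ahead loop shape can be sketched in Python. This is an illustration only, not the generated code: files are modelled as lists of (employee-id, hours) pairs, end of file is modelled by `None`, and the name `pay_computation` is hypothetical.

```python
def pay_computation(hours_records, rate=3.00):
    """Single-driver loop: one PAY record for each HOURS record."""
    reader = iter(hours_records)
    record = next(reader, None)      # loop initialization: read the driver once
    pay_file = []
    while record is not None:        # DO WHILE (not EOF.HOURS)
        employee, hours = record
        pay_file.append((employee, hours * rate))   # assemble and write PAY record
        record = next(reader, None)  # read the next driver record
    return pay_file

pay = pay_computation([(1, 40), (2, 10)])
```

Reading the first record before the loop is what lets the loop test double as the end-of-file check, exactly as in the PL/I version.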


V.1.2 Uniform-Index Matching Computations

Let us extend our treatment of single-level loop implementations to those with more than one
input. We use as our vehicle the variation of the pay calculation that includes a rate file (indexed
by employee-id):

   PAY IS RATE * HOURS IF RATE PRESENT AND HOURS PRESENT

Suppose that the input files RATE and HOURS are to be read sequentially, that their records are
sorted by employee-id and that HOURS is used as the loop driver.

Again because the loop is driven by a single input file, it is implemented using the form DO
WHILE (¬ EOF.driver). However, the computation description dictates that a record of the
output file PAY for a given value of the key employee-id is to be produced if and only if there is a
record for that employee in HOURS and there is a corresponding record in the RATE file. Therefore,
in the body of the loop, before the output record can be calculated, the record (if any) of the
non-driving input that matches the current value of the driver's index must be found.
To find the matching record of the non-driving input we read successive records from its file
comparing the index value of each record with the current loop index. The general matching
algorithm consists of the following loop:

For each non-driving input:

1. If FOUND.input is true (indicating that the record currently held in the input's image
   structure has been used) read the next record of the input.

2. If an EOF condition has occurred on the input, set FOUND.input to false and exit the
   loop.

3. Otherwise, check the index of the current input record against the index of the current
   driver record:

   If =, the records match; set FOUND.input to true and exit

   If <, read the next record of the input and go to step 2

   If >, there is no corresponding record in the input. Set FOUND.input to false (in case
   the index of the record just read may match that of some subsequent driver record)
   and exit


To support this algorithm a flag FOUND.input must be declared for each non-driving input
and initialized to true before the main loop.

The implementation of the rest of the main loop's body (following the matching code)
consists of code that attempts to calculate the output record using only those non-driving inputs
whose FOUND flags are true. Basically, in this implementation the PRESENT checks of the HIBOL
description become checks on the corresponding FOUND flags.
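
The matching steps above can be sketched in Python for one driver and one non-driving input. The sketch assumes both inputs are lists of (key, datum) pairs sorted by key with unique keys; the function name `match_join` and the sample data are hypothetical, and the output calculation (a product) stands in for whatever the flow equation specifies.

```python
def match_join(driver, other):
    """For each driver record, find the matching record of `other` (if any)
    by sequential reading, and emit (key, driver datum * other datum)."""
    other_iter = iter(other)
    current = None   # record image of the non-driving input
    found = True     # initialized to true so the first pass reads a record
    eof = False
    out = []
    for key, driver_datum in driver:
        # Step 1: the held record has been used, so read the next one.
        if found and not eof:
            current = next(other_iter, None)
            eof = current is None
        while True:
            if eof:                       # step 2
                found = False
                break
            if current[0] == key:         # step 3, case =
                found = True
                break
            if current[0] < key:          # step 3, case <: read on, back to step 2
                current = next(other_iter, None)
                eof = current is None
                continue
            found = False                 # step 3, case >: keep record for later
            break
        if found:                         # the PRESENT check becomes a FOUND check
            out.append((key, driver_datum * current[1]))
    return out

hours = [(1, 40), (2, 10), (4, 5)]        # driver
rate = [(2, 3.0), (3, 9.0), (4, 2.0)]     # non-driving input
pay = match_join(hours, rate)
```

Note that in the `>` case the record stays in the image structure unread-ahead, so it is still available when a later driver record catches up with it.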


This matching process must be implemented for every non-driving input in a data driven 
PAY_COMP: PROCEDURE;
   (declarations)

   ON ENDFILE (RATE) EOF.RATE = '1'B;
   ON ENDFILE (HOURS) EOF.HOURS = '1'B;
   READ FILE (RATE) INTO (RATE_RECORD);
   LEVEL_1_MINIMUM.EMPLOYEE = RATE_RECORD.EMPLOYEE;

   DO WHILE (¬ EOF.RATE);
      IF ¬ EOF.HOURS
      THEN DO; /* THIS READS ITEMS, SEQUENTIALLY, FROM A FILE, UNTIL THE REQUESTED
                  RECORD IS FOUND (SET FLAGS TO TRUE) OR PASSED (SET FLAGS TO FALSE). */
              IF FOUND.HOURS_RECORD
              THEN READ FILE (HOURS) INTO (HOURS_RECORD);
           HOURS_RECORD_COMPARE:
              IF EOF.HOURS
              THEN FOUND.HOURS_RECORD = '0'B;
              ELSE IF HOURS_RECORD.EMPLOYEE = LEVEL_1_MINIMUM.EMPLOYEE
              THEN FOUND.HOURS_RECORD = '1'B;
              ELSE IF HOURS_RECORD.EMPLOYEE > LEVEL_1_MINIMUM.EMPLOYEE
              THEN FOUND.HOURS_RECORD = '0'B;
              ELSE DO; READ FILE (HOURS) INTO (HOURS_RECORD);
                       GO TO HOURS_RECORD_COMPARE;
                   END;
           END;

      IF FOUND.HOURS_RECORD
      THEN DO;
              PAY_RECORD.PAY = RATE_RECORD.RATE * HOURS_RECORD.HOURS;
              PAY_RECORD.EMPLOYEE = LEVEL_1_MINIMUM.EMPLOYEE;
              WRITE FILE (PAY) FROM (PAY_RECORD);
           END;

      READ FILE (RATE) INTO (RATE_RECORD);
      LEVEL_1_MINIMUM.EMPLOYEE = RATE_RECORD.EMPLOYEE;
   END;
END PAY_COMP;


Figure 2: PL/I code for PAY IS RATE * HOURS
First, notice that the iteration structure is fundamentally different from that for a single
driver loop. The index value determination and EOF checking is now performed at the beginning
of the loop body.³¹ As always, the iteration is terminated when all drivers are exhausted (when the
flag EOF_SO_FAR ends up true after all drivers have been read). Thus the loop exit must appear
before the output calculations and the form DO WHILE ('1'B) is used instead of DO WHILE (¬
EOF.driver) (as in the single-driver case). This is just a minor variation on the basic scheme.


What is interesting in the implementation of Fig. 3 is the use of the PL/I ACTIVE structure
and the ACTIVE_DRIVER_COUNT variable in determining the proper next index value. The idea is
to look through the drivers in succession. The first is used to establish a tentative index value for
the current iteration. The first driver is also assigned a number that marks it active (for the time
being). If the next driver has the same index value it is given the same number, indicating that it
will be active when the first is; if it has a lower index value the loop index is reset and the second
driver is assigned a higher number, meaning that it is tentatively active (and implying that the
first is inactive). When all drivers have been examined, those sharing the highest ACTIVE number
(held in ACTIVE_DRIVER_COUNT) are the active drivers for the current iteration.
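
The numbering scheme just described can be sketched as follows. This is an illustration under assumptions, not the code of the figure: each driver is represented only by the index value of its current record (or `None` if the driver is exhausted), and the function name `select_active` is hypothetical.

```python
def select_active(driver_indices):
    """Return (loop_index, active_flags) for one iteration.

    driver_indices[i] is the index value of driver i's current record,
    or None if that driver has reached end of file.
    """
    active = [0] * len(driver_indices)   # plays the role of the ACTIVE structure
    count = 0                            # plays the role of ACTIVE_DRIVER_COUNT
    loop_index = None
    for i, idx in enumerate(driver_indices):
        if idx is None:
            continue                     # exhausted drivers are never active
        if loop_index is None or idx < loop_index:
            count += 1                   # new tentative minimum: higher number
            loop_index = idx
            active[i] = count
        elif idx == loop_index:
            active[i] = count            # shares the current tentative minimum
    flags = [count > 0 and a == count for a in active]
    return loop_index, flags

index, flags = select_active([5, 3, 3, None, 7])
```

A result of `None` for the loop index corresponds to the case in which every driver is exhausted and the loop must exit.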


V.2 Multiple-Level Loops

Multiple-level loops introduce the need for maintenance of current index values for each
distinct loop level and for control structures to implement loop driving from loops at lower levels.
Multiple-level loops arise from two basic sources: reduction computations and mixed-index
matching computations. Let us examine the implementation of each in turn.


31 It could be done at the end of the body if the same code were duplicated as an initialization
before the loop were entered. We have refrained from doing this to minimize code.


ITEMDEMAND_COMP: PROCEDURE;
   (declarations)

   ON ENDFILE (DEMAND) EOF.DEMAND = '1'B;
   READ FILE (DEMAND) INTO (DEMAND_RECORD);
   IF ¬ EOF.DEMAND
   THEN DO; LEVEL_2_MINIMUM.ITEM = DEMAND_RECORD.ITEM;
            LEVELS_1_THRU_2_MINIMUM.ITEM = LEVEL_2_MINIMUM.ITEM;
        END;
   ELSE LEVEL_1 = '0'B;

   DO WHILE (LEVEL_1);
      DEFINED.ITEMDEMAND = '0'B;
      LEVEL_2 = '1'B;
      DO WHILE (LEVEL_2);
         IF DEFINED.ITEMDEMAND
         THEN ITEMDEMAND_RECORD.ITEMDEMAND = ITEMDEMAND_RECORD.ITEMDEMAND + DEMAND_RECORD.DEMAND;
         ELSE DO; ITEMDEMAND_RECORD.ITEMDEMAND = DEMAND_RECORD.DEMAND;
                  DEFINED.ITEMDEMAND = '1'B;
              END;
         READ FILE (DEMAND) INTO (DEMAND_RECORD);
         IF ¬ EOF.DEMAND
         THEN DO; LEVEL_2_MINIMUM.ITEM = DEMAND_RECORD.ITEM;
                  IF LEVEL_2_MINIMUM.ITEM > LEVELS_1_THRU_2_MINIMUM.ITEM
                  THEN LEVEL_2 = '0'B;
              END;
         ELSE DO; LEVEL_2 = '0'B;
                  LEVEL_1 = '0'B;
              END;
      END;
      ITEMDEMAND_RECORD.ITEM = LEVELS_1_THRU_2_MINIMUM.ITEM;
      WRITE FILE (ITEMDEMAND) FROM (ITEMDEMAND_RECORD);
      IF ¬ EOF.DEMAND
      THEN LEVELS_1_THRU_2_MINIMUM.ITEM = LEVEL_2_MINIMUM.ITEM;
   END;
END ITEMDEMAND_COMP;


Figure 4: PL/I code for ITEMDEMAND IS THE SUM OF DEMAND FOR EACH ITEM-ID
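
The two-level structure of this reduction can be sketched compactly in Python. The sketch assumes, for illustration, that DEMAND is a list of ((item-id, store-id), demand) records sorted by item-id; the function name `item_demand` and the sample data are hypothetical.

```python
def item_demand(demand_records):
    """Sum the demand over all records that share an item-id."""
    out = []
    reader = iter(demand_records)
    record = next(reader, None)                 # initial read of the driver
    while record is not None:                   # outer loop: one pass per item-id
        item = record[0][0]
        total = None                            # nothing accumulated yet
        while record is not None and record[0][0] == item:   # inner loop
            total = record[1] if total is None else total + record[1]
            record = next(reader, None)
        out.append((item, total))               # write one ITEMDEMAND record
    return out

totals = item_demand([(("a", "s1"), 2), (("a", "s2"), 3), (("b", "s1"), 4)])
```

The inner loop ends either at end of file or when a record with a new item-id has been read, which is the same condition the LEVEL_2 flag encodes in the figure.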


The ender sheoh ae te ty mnt ee Wi A 


LL and LEVEL _Z ave ened wg cont et en tn oe aatiawelinn 2 


0. (Initialize) Read a record of the CURRENTORDER file.

1. Read records from the PRICE file until either:

   a. one is found that has an item-id value matching the driver's item-id value, in which
      case all EXTENDEDPRICE records for that item can be calculated, or

   b. one is found that has an item-id value greater than the driver's or the PRICE file is
      exhausted, in which case there is no PRICE value and the inner loop can be skipped.

2. (Inner loop) Generate all output records for the given item-id value, reading records from the
   driver as you go. When a driver record is read that has an item-id value greater than that of the
   current PRICE record, or the driving file is exhausted, exit.

3. If neither input file is exhausted go to step 1 and repeat; otherwise exit.

In this way each record of the PRICE file is read only once.³²
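
The steps above can be sketched in Python. The sketch makes two simplifying assumptions that differ slightly from the description: the first PRICE record is read before the loop rather than inside step 1, and the inner loop ends when the driver's item-id changes. CURRENTORDER is modelled as a list of ((item-id, store-id), quantity) records and PRICE as a list of (item-id, price) records, both sorted by item-id; the function name `extended_price` is hypothetical.

```python
def extended_price(current_order, price):
    """Mixed-index matching: CURRENTORDER drives, PRICE is matched on item-id."""
    out = []
    orders, prices = iter(current_order), iter(price)
    order = next(orders, None)            # step 0: initial read of the driver
    price_rec = next(prices, None)
    while order is not None and price_rec is not None:     # step 3
        item = order[0][0]
        # Step 1: advance PRICE until its item-id is not less than the driver's.
        while price_rec is not None and price_rec[0] < item:
            price_rec = next(prices, None)
        found = price_rec is not None and price_rec[0] == item
        # Step 2: inner loop over every driver record with this item-id.
        while order is not None and order[0][0] == item:
            if found:
                out.append((order[0], order[1] * price_rec[1]))
            order = next(orders, None)
    return out

orders = [(("a", "s1"), 2), (("a", "s2"), 1), (("c", "s1"), 5), (("d", "s1"), 1)]
prices = [("a", 10), ("b", 3), ("d", 7)]
result = extended_price(orders, prices)
```

Because PRICE is only ever advanced forward, each of its records is read once, which depends on both inputs being sorted by item-id.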


A PL/I implementation of this process is shown in Fig. 5. The reader will notice that this
implementation is unnecessarily inefficient because when the matching PRICE record is not found the
inner loop is executed anyway. This is because that is what happens in the general case, where
there may be calculations in the inner loop that can proceed without the use of a missing
input.


V.3 Aggregated Computations

The aggregation of two or more computations into one nested loop introduces a consideration
not seen before: the synchronization of computations at different loop levels. Consider the two
HIBOL computations:

   EXTENDEDPRICE IS PRICE * CURRENTORDER IF PRICE PRESENT
                                         AND CURRENTORDER PRESENT

   VALUESHIPPED IS PRICE * ITEMDEMAND IF PRICE PRESENT
                                      AND ITEMDEMAND PRESENT


32 If CURRENTORDER had been unsorted or sorted differently, records from PRICE would
generally be read more than once.


where CURRENTORDER is the same as above (with index (item-id, store-id)) and ITEMDEMAND is
a file with index (item-id). As we have seen above, the first computation can be implemented as
a two-level nested loop. The second computation iterates over the single key item-id and so has
a single-level loop.

When aggregated the result is a two-level loop:


“Levels. itemcid) 
Protog: ehlealate Seiad discuss 
Outputspempty 


> anputeg emt, sc 
ep ites: empty 


Loop 2 {inner loop) . ; 
_heve}s, ff tennis, gtone-td).... nj gee dest fee epee Beet cde tno 
Inputsp: ICURRENTOROER) | ° ; 
. ee logs.. ... -cateudate wostendtedrprice coc 8 
Dutt spAERTENEDPRICED 


hy as > at ia 
Inputers ompty 
Outputecemty 
What is significant here is that the computations in the aggregate occur in different levels.

Suppose that the PRICE file is guaranteed to have a record for every item-id. Then ITEMDEMAND
is the natural choice for a driver for the value-shipped computation because a record of the output
will be generated if and only if there is a record in ITEMDEMAND for the same key. As for the
extended-price computation, CURRENTORDER is the only possible choice for the driver.

Now the outer loop iterates over item-id values determined by both drivers. Initially, the
first record of each driver is read. There are three cases, distinguished by the relative values of the
item-id keys in these records:

33 Notice that in the finalized loop description there is no General section.




(declarations)

(ON conditions)

(read CURRENTORDER and initialize LEVEL_2_MINIMUM.ITEM = CURRENTORDER_RECORD.ITEM;)
(read ITEMDEMAND and initialize LEVEL_1_MINIMUM.ITEM = ITEMDEMAND_RECORD.ITEM;)

(code to set the synchronization flag for each level to '0'B if its driver had no records)

(comparison of ITEM values to set synchronization flags:)
IF LEVEL_2_MINIMUM.ITEM > LEVEL_1_MINIMUM.ITEM
THEN DO; DO_LEVEL_1 = '1'B;
         LEVEL_2 = '0'B;
         LEVELS_1_THRU_2_MINIMUM.ITEM = LEVEL_1_MINIMUM.ITEM;
     END;
ELSE IF LEVEL_2_MINIMUM.ITEM < LEVEL_1_MINIMUM.ITEM
THEN DO; DO_LEVEL_1 = '0'B;
         LEVEL_2 = '1'B;
         LEVELS_1_THRU_2_MINIMUM.ITEM = LEVEL_2_MINIMUM.ITEM;
     END;
ELSE DO; DO_LEVEL_1 = '1'B;
         LEVEL_2 = '1'B;
         LEVELS_1_THRU_2_MINIMUM.ITEM = LEVEL_1_MINIMUM.ITEM;
     END;

DO WHILE (LEVEL_1);
   (read PRICE record)
   IF DO_LEVEL_1 THEN (calculate value-shipped)

   DO WHILE (LEVEL_2);
      IF FOUND.PRICE_RECORD THEN (calculate and write extended-price)
      (read CURRENTORDER and reset LEVEL_2_MINIMUM.ITEM = CURRENTORDER_RECORD.ITEM;)
      (check for eof)
      IF LEVEL_2_MINIMUM.ITEM > LEVELS_1_THRU_2_MINIMUM.ITEM THEN LEVEL_2 = '0'B;
      ELSE LEVEL_2 = '1'B;
   END /* LEVEL_2 */;
   IF DO_LEVEL_1 THEN DO /* Epilog LEVEL_1 */;
      IF DEFINED.VALUESHIPPED THEN (write value-shipped record)
      (read ITEMDEMAND and reset LEVEL_1_MINIMUM.ITEM = ITEMDEMAND_RECORD.ITEM;)
   END /* Epilog LEVEL_1 */;

   (synchronization code exactly as above)
END /* LEVEL_1 */;


Figure 6: Illustration of synchronization code for aggregated computations
PAY_COMP: PROCEDURE;

   DECLARE DSAG1 INPUT FILE SEQUENTIAL RECORD,
           PAY OUTPUT FILE SEQUENTIAL RECORD;
   DECLARE 1 PAY_RECORD,
             2 EMPLOYEE FIXED DECIMAL (4),
             2 PAY FIXED DECIMAL (4),
           1 DSAG1_RECORD,
             2 EMPLOYEE FIXED DECIMAL (4),
             2 DEFINED ALIGNED,
               3 HOURS BIT (1),
               3 OVERTIME BIT (1),
             2 HOURS FIXED DECIMAL (3),
             2 OVERTIME FIXED DECIMAL (3),
           1 HOURS_RECORD,
             2 EMPLOYEE FIXED DECIMAL (4),
             2 HOURS FIXED DECIMAL (3);
   DECLARE 1 EOF ALIGNED,
             2 DSAG1 BIT (1) UNALIGNED INITIAL ('0'B);

   ON ENDFILE (DSAG1) EOF.DSAG1 = '1'B;
   READ FILE (DSAG1) INTO (DSAG1_RECORD);
   DO WHILE (¬ EOF.DSAG1);
      IF DSAG1_RECORD.DEFINED.HOURS
      THEN DO;
              PAY_RECORD.PAY = DSAG1_RECORD.HOURS * 3.0;
              PAY_RECORD.EMPLOYEE = DSAG1_RECORD.EMPLOYEE;
              WRITE FILE (PAY) FROM (PAY_RECORD);
              READ FILE (DSAG1) INTO (DSAG1_RECORD);
           END;
      ELSE
         READ FILE (DSAG1) INTO (DSAG1_RECORD);
   END;
END PAY_COMP;


Figure 7: PL/I code for PAY IS HOURS * 3.00 with Aggregated Flow
V.5.2 Core Table Access
<a ey oe Ai Te OIE 


fe the file Mi, fer example. sae: Wd old ‘el 


dAeG. 403 ~) Sure 00 


i PRICE_FECOND (sane. ! | | 
1: a ee 3 
F PRE Free Gece «, . on So a 


| td the code to fill up this uit | : 
used. If the sort orders are compatible, the method of access is analogous to sequential
access except that “records” are “read” from the table instead of secondary storage (see Fig. 8).

If the input file is “randomly” organized, the access code generates a hash index
and then mimics the PL/I access procedure: compare the key values of the indicated table entry
with the desired ones; if identical, stop; otherwise examine successive entries in wrap-around
fashion until an empty slot is found (end of the “bucket”) or a complete cycle has been made.
If the sort orders are not compatible, a more complicated binary search is implemented.
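
The probing procedure just described (compare, stop on a match, otherwise step forward with wrap-around until an empty slot or a complete cycle) can be sketched in Python as follows. The table layout, the function name probe, and the hash rule "item mod table size" are assumptions made for illustration; the source does not give the generated code for this case.

```python
def probe(table, wanted_key, hash_index):
    """Search an open-addressed table, starting at hash_index.

    Returns the matching entry, or None when an empty slot (end of the
    bucket) is reached or a complete cycle has been made.
    """
    size = len(table)
    for step in range(size):                       # at most one complete cycle
        slot = table[(hash_index + step) % size]   # wrap-around
        if slot is None:                           # empty slot: end of the bucket
            return None
        if slot[0] == wanted_key:                  # key values identical: stop
            return slot
    return None


# Hypothetical 5-slot core table of (item, price) entries; hash = item mod 5.
table = [None] * 5
for item, price in [(7, 10), (12, 20), (4, 30)]:
    i = item % 5
    while table[i] is not None:                    # fill with the same probing rule
        i = (i + 1) % 5
    table[i] = (item, price)

print(probe(table, 12, 12 % 5))   # collides with item 7, found one slot later
print(probe(table, 9, 9 % 5))     # not present: stops at the empty slot
```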


V.5.3 Random Access

When the records of an input are directly (regional (2)) organized, the file is randomly
accessed. Instead of using a loop, as with sequential access, a single read using a calculated
key is executed. For example, if the PRICE file in the STEREO computation (above) were
randomly accessed, the accessing part of the code would be:

   PRICE_RECORD_HASH_VALUE = MOD (5 * (MOD (LEVEL_2_MINIMUM.ITEM, )), );
   PRICE_RECORD_HASH_VALUE_STRING = PRICE_RECORD_HASH_VALUE;
   PRICE_RECORD_HASH_KEY =
      LEVEL_2_MINIMUM.ITEM || PRICE_RECORD_HASH_VALUE_STRING;
   FOUND.PRICE_RECORD = '1'B;
   READ FILE (PRICE) INTO (PRICE_RECORD) KEY (PRICE_RECORD_HASH_KEY);

The first three statements calculate the source key string, which has two parts: the region number
(rightmost 8 characters) and the comparison key (the remaining characters). The case where the
record is not present is handled by the statement:


   ON KEY (PRICE) IF ONCODE = 51 THEN FOUND.PRICE_RECORD = '0'B;

which resets the FOUND flag if a “keyed record not found” error occurs.
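
The shape of this keyed access can be sketched in Python. Everything here is a stand-in: a dictionary plays the role of the regional (2) file, KeyError plays the role of the KEY condition with ONCODE = 51, and the hash rule and region count are assumptions, since the moduli in the listing above are not legible in the source.

```python
def source_key(item, n_regions):
    """Build a source key: comparison key followed by an 8-character region number."""
    region = item % n_regions          # hypothetical hash; the source's moduli are not legible
    return f"{item}{region:08d}"       # rightmost 8 characters hold the region number


def read_keyed(price_file, key):
    """Single keyed read. Returns (found, record), like FOUND.PRICE_RECORD."""
    found = True                       # FOUND.PRICE_RECORD = '1'B
    try:
        record = price_file[key]
    except KeyError:                   # stands in for ON KEY with ONCODE = 51
        found, record = False, None
    return found, record


N_REGIONS = 97                         # assumed region count
price_file = {source_key(1234, N_REGIONS): {"item": 1234, "price": 250}}

key = source_key(1234, N_REGIONS)
print(key[-8:], key[:-8])              # region number, comparison key
print(read_keyed(price_file, key))
print(read_keyed(price_file, source_key(4321, N_REGIONS)))
```

As in the PL/I version, the flag is set optimistically before the read and reset only when the keyed record is missing.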


   IF ¬ EOF.PRICE
   THEN DO; IF FOUND.PRICE_RECORD
            THEN IF PRICE_RECORD_INDEX <= PRICE_RECORD_SIZE
                 THEN PRICE_RECORD_INDEX = PRICE_RECORD_INDEX + 1;
                 ELSE EOF.PRICE = '1'B;

      PRICE_RECORD_COMPARE:
            IF EOF.PRICE
            THEN FOUND.PRICE_RECORD = '0'B;
            ELSE IF PRICE_RECORD.ITEM = LEVELS_1_THRU_2_MINIMUM.ITEM
                 THEN FOUND.PRICE_RECORD = '1'B;
                 ELSE IF PRICE_RECORD.ITEM > LEVELS_1_THRU_2_MINIMUM.ITEM
                      THEN FOUND.PRICE_RECORD = '0'B;
                      ELSE DO; IF FOUND.PRICE_RECORD
                               THEN IF PRICE_RECORD_INDEX <= PRICE_RECORD_SIZE
                                    THEN PRICE_RECORD_INDEX =
                                            PRICE_RECORD_INDEX + 1;
                                    ELSE EOF.PRICE = '1'B;
                               GO TO PRICE_RECORD_COMPARE;
                               END;
            END;

Figure 8: PL/I Code for Reading PRICE by Core Table in the Extended Price Computation
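
The idea behind Figure 8, stepping a cursor through a sorted in-core table exactly as one would read a sorted file, can be sketched in Python. This is a simplified illustration rather than a transcription of the figure: the class name CoreTable is hypothetical, the cursor stays on a matching entry instead of advancing on the next request, and the wanted keys are assumed to arrive in ascending order.

```python
class CoreTable:
    """Sorted in-core table read with a cursor, as a stand-in for a sorted file."""

    def __init__(self, records):
        self.records = records         # list of (item, price), sorted by item
        self.index = 0                 # plays the role of PRICE_RECORD_INDEX

    def match(self, wanted_item):
        """Advance the cursor to wanted_item; return its price or None."""
        while self.index < len(self.records):
            item, price = self.records[self.index]
            if item == wanted_item:
                return price           # found: cursor stays on the matching entry
            if item > wanted_item:
                return None            # table is already past the wanted key
            self.index += 1            # table entry is behind: "read" the next one
        return None                    # end of table plays the role of EOF


price = CoreTable([(10, 5), (20, 7), (40, 9)])
print([price.match(k) for k in (10, 20, 30, 40, 50)])
```

Because the cursor never moves backwards, a full pass over the wanted keys costs one pass over the table, which is the property that makes this access method analogous to sequential access.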
V.6 The General Case--A Summary
We have seen that the basic code structure for a computation consists of the following four
parts:³⁵
   declarations
   on-conditions
   loop initialization
   the nested loop³⁶
The basic structure of the body of each loop in the nested loop is as follows:
   read & match non-driving inputs
   Prolog calculations
   inner loop (if any)
   Epilog calculations
   write outputs
   read active drivers
   determine new active drivers and index values for the next iteration
   loop synchronization code
   exit on EOF or (for inner loop) sub-index change
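
A minimal Python sketch of this loop body follows, assuming a single driving flow indexed by (item, store) and an output that totals the data per item. The function and variable names are hypothetical, and the steps that need several inputs (matching non-driving inputs, choosing new active drivers, synchronization) are omitted.

```python
def total_by_item(demand):
    """demand: list of ((item, store), quantity) sorted by index.

    Returns a list of (item, total) records.
    """
    records = iter(demand)                    # loop initialization
    current = next(records, None)             # first read of the driver (None = EOF)
    output = []
    while current is not None:                # outer loop: exit on EOF
        item = current[0][0]
        total = 0                             # prolog calculation
        # inner loop: exit on EOF or when the sub-index (item) changes
        while current is not None and current[0][0] == item:
            total += current[1]               # inner-loop calculation
            current = next(records, None)     # read active driver
        output.append((item, total))          # epilog calculation, then write output
    return output


demand = [((1, 10), 3), ((1, 11), 4), ((2, 10), 5)]
print(total_by_item(demand))
```

The read-ahead record held in current is what lets the inner loop detect a sub-index change without losing the record that caused it.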


35 It may be interesting to note that ProtoSystem I's code generator generates these sections
simultaneously as four separate output streams (rather than sequentially) that are concatenated
together when they are all finished.

36 There is no clean-up code following the loop because the end of the job step which is the
computation does everything necessary, including the closing of files.
Appendix I: The Simple Expositional Artificial Language (SEAL)

As an aid to discussing loops we use an artificial language similar in form to traditional
high-level languages such as ALGOL, PL/I and FORTRAN. The basic constructs of this
language are:
Iteration: expressed by the construct:

   for each <loop-index> from <driving-flow-set>
      <body>

which has the meaning: perform the actions contained in the loop for each value of the
<loop-index> obtained from the flows in the <driving-flow-set>. The <loop-index> is either the
name of the index associated with the flows in the <driving-flow-set> or (for reasons that
become evident in this paper) a sub-index of corresponding sub-flows. The set of values that the
<loop-index> takes on is the union of the index sets of the drivers. This set is enumerated at
execution time by reading successive records of the drivers.
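
The enumeration rule (the union of the drivers' index sets, obtained by reading successive records) can be sketched in Python as a merge of sorted drivers. The representation of a flow as a sorted list of (index, datum) pairs and the name for_each are assumptions made for this sketch.

```python
import heapq


def for_each(drivers):
    """Yield each index value occurring in any driver, once, in sorted order.

    drivers: list of flows, each a list of (index, datum) sorted by index.
    """
    previous = object()                       # sentinel that equals no index
    for index, _ in heapq.merge(*drivers, key=lambda rec: rec[0]):
        if index != previous:                 # an index shared by drivers is visited once
            yield index
            previous = index


currentorder = [((1, 10), 3), ((2, 10), 5)]
backorder = [((1, 10), 1), ((1, 11), 2)]
print(list(for_each([currentorder, backorder])))
```

Index (1, 10) occurs in both drivers but is produced once, so the loop body runs once per index value rather than once per record.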
I/O and defined: Input (record fetching) is expressed by the get operator, thus:

   get <variable-instance>

where <variable-instance> specifies a flow and a particular value for its index, represented as a
variable (see below). A statement like this means: fetch the indicated record if it exists.

Output is expressed by the write operator, similarly:

   write <variable-instance>

The defined operator is an operator for use in conditional expressions. It is
applicable only to flow variable instances. The form

   defined [<variable-instance>]

evaluates to “true” if the specified record of the indicated flow exists. In particular, if the record
is an input (obtained through a get) it is “defined” if and only if the get succeeded; if the record
is an output it is “defined” if and only if the generating code produced a datum for the record.
Conditional Execution: expressed by the familiar if-then-else construct:

   if <condition> then <statement-list>₁
                  else <statement-list>₂

which means that if the logical expression <condition> evaluates to “true”, perform the statements
in <statement-list>₁; otherwise, perform the statements in <statement-list>₂.

Logical expressions can be built from comparison operators, the defined
operator, and the logical connectives and, or and not.
Conditional Expressions: expressed by the construct:

   if <condition> then <expression>₁
                  else <expression>₂

which evaluates to the value of <expression>₁ if the logical expression <condition> evaluates to
“true” and to the value of <expression>₂ otherwise.
Variables and Assignment: expressed by the construct:

   <variable> = <expression>

where = is the assignment operator.

A variable can be either a scalar or an indexed variable. Flows are represented as indexed
variables with an index identical to the flow's index. Thus, DEMAND (item-id, store-id) is the
variable corresponding to the DEMAND flow, and an instance of its index selects the datum of the
corresponding flow record. That is, for example, the statement

   DEMAND (1234, 5678) = CURRENTORDER (1234, 5678) +
                         BACKORDER (1234, 5678)

means that the datum of the record of DEMAND for item 1234 ordered by store 5678 is to get the
value obtained by adding the data of the corresponding records from CURRENTORDER and
BACKORDER.
Typically, the record-by-record computation implied by a HIBOL flow equation would look
like that equation translated into our artificial language (with a generalized index), such as

   DEMAND (item-id, store-id) =
      if defined [CURRENTORDER (item-id, store-id)]
         and defined [BACKORDER (item-id, store-id)]
      then CURRENTORDER (item-id, store-id) +
           BACKORDER (item-id, store-id)
      else if defined [CURRENTORDER (item-id, store-id)]
           then CURRENTORDER (item-id, store-id)
           else if defined [BACKORDER (item-id, store-id)]
                then BACKORDER (item-id, store-id)
                else undefined

and would appear somewhere in the body of a loop.
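
The same record-by-record rule can be written as a small Python function. This is a sketch under assumptions: flows are modelled as dictionaries keyed by (item-id, store-id), a missing key plays the role of an undefined record, and None stands for "undefined" (no output record is written).

```python
def demand_datum(currentorder, backorder, index):
    """Datum of DEMAND at index, or None for "undefined"."""
    c = currentorder.get(index)        # get CURRENTORDER (index); None if no such record
    b = backorder.get(index)           # get BACKORDER (index)
    if c is not None and b is not None:
        return c + b
    if c is not None:
        return c
    if b is not None:
        return b
    return None


currentorder = {(1234, 5678): 3, (1, 1): 2}
backorder = {(1234, 5678): 4, (2, 2): 9}

# Loop over the union of the drivers' indices; write only defined results.
demand = {}
for index in sorted(set(currentorder) | set(backorder)):
    datum = demand_datum(currentorder, backorder, index)
    if datum is not None:
        demand[index] = datum
print(demand)
```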
Sub-flows: A sub-flow (for use in the for each construct) is expressed by:
   <flow-variable> (<sub-index>)
For example,
   CURRENTORDER (item-id)
denotes the sub-flow of CURRENTORDER consisting of just those records whose indices correspond to
the value of the sub-index (item-id). Generally, the value of the indicated sub-index is fixed by an
enclosing loop.
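
A sub-flow can be pictured in Python as a filter on the leading key of the index. The dictionary representation of a flow and the name sub_flow are assumptions for this sketch.

```python
def sub_flow(flow, item_id):
    """Records of flow whose index has item_id as its first key."""
    return {index: datum for index, datum in flow.items() if index[0] == item_id}


currentorder = {(1, 10): 3, (1, 11): 4, (2, 10): 5}

# The enclosing loop fixes the value of the sub-index (item-id).
for item_id in sorted({item for item, _ in currentorder}):
    print(item_id, sub_flow(currentorder, item_id))
```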
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